New hybrid machine learning forecasts lake ecosystem responses to climate change
Throughout the middle of the 20th century, phosphorus inputs from detergents and fertilizers degraded the water quality of Switzerland's Lake Geneva, spurring officials to take action to remediate pollution in the 1970s.
"The obvious remedy was to reverse the phosphorus loading, and this simple idea helped enormously, but it didn't return the lake to its former state, and that's the problem," said George Sugihara, a biological oceanographer at UC San Diego's Scripps Institution of Oceanography.
Sugihara, Boston University's Ethan Deyle, and three international colleagues spent five years searching for a better way to forecast and manage Lake Geneva's ecological response to the threat of phosphorus pollution, to which the effects of climate change must now be added. The team, including Damien Bouffard of the Swiss Federal Institute of Aquatic Sciences and Technology, published its new hybrid empirical dynamic modeling (EDM) approach on June 20 in the journal Proceedings of the National Academy of Sciences.
"Nature is much more interconnected and interdependent than scientists would often like to think," said Sugihara, the McQuown Chair Professor of Natural Science at Scripps. EDM can help in this context as a form of supervised machine learning, a way for computers to learn patterns and teach researchers about the mechanisms behind the data.
"You pull one lever and everything else changes, whack-a-mole style. Single-factor experiments, the hallmark of 20th-century science where everything is held constant, can teach you a lot in principle, but it is not how the world works," he said.
"If this were not the case, if nature behaved more like the single-factor experiments and was less connected and interdependent, we'd be able to predict outcomes with simple models where relationships don't change."
Interdependence and changing relationships are the reality of ecosystems, and they are also the reality of financial markets, where prediction is so challenging, Sugihara noted. EDM was honed in the crucible of financial forecasting from the mid-1990s through the early 2000s, when Sugihara was a managing director at Deutsche Bank.
Sugihara has drawn upon his financial background to design market tools for supporting sustainable marine fisheries for the last 20 years at Scripps. He calls EDM "math without equations."
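To give a sense of what "math without equations" can look like in practice, here is a minimal sketch of simplex projection, one of the basic EDM forecasting algorithms. It is not the hybrid model from the study: the logistic-map data, the function names, and the choices of embedding dimension `E = 2` and a 300-point library are all assumptions made for illustration. The forecast is built only from past states of the series that resemble the current one, without writing down a governing equation for the forecaster to use.

```python
import math


def delay_embed(series, E, tau=1):
    """Return (t, vector) pairs with vector = (x[t], x[t-tau], ..., x[t-(E-1)*tau])."""
    start = (E - 1) * tau
    return [(t, tuple(series[t - k * tau] for k in range(E)))
            for t in range(start, len(series))]


def simplex_forecast(library, target_vec, E, tp=1):
    """Forecast the value tp steps after target_vec using neighbours in the library."""
    candidates = []
    for t, vec in delay_embed(library, E):
        if t + tp < len(library):  # a neighbour is only useful if its future is known
            candidates.append((math.dist(vec, target_vec), library[t + tp]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:E + 1]          # E + 1 nearest neighbours form the simplex
    d_min = max(nearest[0][0], 1e-12)     # guard against division by zero
    weights = [math.exp(-d / d_min) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def logistic_map(n, x0=0.2, r=3.8):
    """Synthetic chaotic series (hypothetical stand-in for real observations)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


series = logistic_map(400)
library = series[:300]   # data the forecaster is allowed to learn from
E = 2                    # assumed embedding dimension

preds, truth = [], []
for i in range(300, 399):                       # points the library has never seen
    target = tuple(series[i - k] for k in range(E))
    preds.append(simplex_forecast(library, target, E))
    truth.append(series[i + 1])

skill = pearson(preds, truth)
print(f"out-of-sample forecast skill (Pearson r): {skill:.3f}")
```

In practice the embedding dimension is usually chosen by checking which value gives the best out-of-sample skill, rather than fixed in advance as it is here.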
But EDM is not a black-box method, said Deyle, referring to quantitative methods based on mysterious mathematical or computational formulas. The black-box label is a criticism he says is often raised about machine learning.
"Rather, it uses the data to tell you in the most direct way, with minimal assumptions, what is going on. What are the important variables? How do the relationships change through time? It has a mechanism and transparency that comes directly from the data."
What Sugihara's team has attempted departs from traditional modeling methods used in recent decades. As Deyle notes, parts of the well-established models are represented by constants.
"The fixed and constant force of gravity, or the shape and depth of a lake, for example. Consequently, physical processes in the lake can be very well modeled with simple equations," he said.
Not so for the changing ecology and biochemistry.
"The organisms driving change in an ecosystem like Lake Geneva's have changed over the last two decades. The food web has changed, and is constantly changing, along with the lake biochemistry," Bouffard said.
"The standard tools are ill-suited for such problems," said Deyle, who received his Ph.D. in biological oceanography from Scripps Oceanography with adviser Sugihara in 2015.
"Lake Geneva is one of the most well-studied systems in the world. It's not a coincidence that it was an opportunity to push the envelope with a machine-learning approach to ecological forecasting," Deyle said.
The authors demonstrate that their hybrid approach leads not only to substantially better prediction but also to a more actionable description of the processes, such as biogeochemical and ecological ones, that drive water quality.
Notably, the hybrid model suggests that the impact on water quality of raising air temperature by 3 degrees Celsius (5.4 degrees Fahrenheit) would be on the same order as the phosphorus pollution of the previous century, and that best management practices may no longer involve a single control lever such as reducing phosphorus inputs alone.
"One of the intellectual cornerstones of all this is minimalism," Sugihara said. "Extracting information out of data with the fewest assumptions."
A simple model that predicts target data yet to be collected is more convincing than a complex model that may agree with current thinking and can be made to "fit" history remarkably well, but does not actually "predict" events yet to be seen. This was the major issue in financial applications, where it is easy to find things that "fit," but nearly impossible to find anything that actually "predicts."
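The difference between "fitting" and "predicting" can be made concrete with a toy example. The sketch below uses synthetic data (a linear trend plus random noise, with a fixed seed) and two hypothetical models that are not taken from the study: a "memorizer" that stores every historical point, and a straight line fitted by least squares. The memorizer reproduces history perfectly but has learned nothing that carries forward, while the simpler line fits history imperfectly and still forecasts unseen values more accurately.

```python
import random

random.seed(0)  # synthetic data only; nothing here comes from the study

# Hypothetical series: a linear trend plus Gaussian noise.
ys = [0.5 * t + random.gauss(0, 1) for t in range(80)]
history_t, future_t = list(range(60)), list(range(60, 80))
history_y, future_y = ys[:60], ys[60:]


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5


def fit_line(ts, vals):
    """Simple model: ordinary least squares for vals ~ a + b * t."""
    n = len(ts)
    mt, mv = sum(ts) / n, sum(vals) / n
    b = (sum((t - mt) * (v - mv) for t, v in zip(ts, vals))
         / sum((t - mt) ** 2 for t in ts))
    a = mv - b * mt
    return lambda t: a + b * t


def fit_memorizer(ts, vals):
    """'Complex' model: stores every point and answers with the nearest stored time."""
    table = dict(zip(ts, vals))
    return lambda t: table[min(table, key=lambda s: abs(s - t))]


line = fit_line(history_t, history_y)
memo = fit_memorizer(history_t, history_y)

# "Fit": error on the history each model was built from.
line_fit = rmse([line(t) for t in history_t], history_y)
memo_fit = rmse([memo(t) for t in history_t], history_y)

# "Predict": error on data neither model has seen.
line_pred = rmse([line(t) for t in future_t], future_y)
memo_pred = rmse([memo(t) for t in future_t], future_y)

print(f"line:      fit RMSE = {line_fit:.2f}, predict RMSE = {line_pred:.2f}")
print(f"memorizer: fit RMSE = {memo_fit:.2f}, predict RMSE = {memo_pred:.2f}")
```

Judged by fit alone, the memorizer looks like the better model; judged by prediction, it is clearly the worse one.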
"The more complicated something is, the easier it is to fool yourself," he said. "Our hybrid approach seems to have a balance that works."
Study co-authors include Victor Frossard of Université Savoie Mont Blanc, and Robert Schwefel and John Melack of the University of California, Santa Barbara.
More information: A hybrid empirical and parametric approach for managing ecosystem complexity: Water quality in Lake Geneva under nonstationary futures, Proceedings of the National Academy of Sciences (2022). DOI: 10.1073/pnas.2102466119.
Journal information: Proceedings of the National Academy of Sciences
Provided by University of California - San Diego