Dataset size counts for better climate and environmental predictions

October 11, 2017

A new statistical tool for modeling large climate and environmental datasets that has broad applications—from weather forecasting to flood warning and irrigation management—has been developed by researchers at KAUST.

Climate and environmental datasets are often very large and contain measurements taken across many locations and over long periods. Their large sample sizes and high dimensionality introduce significant statistical and computational challenges. Gaussian process models used in spatial statistics, for example, become prohibitively expensive to fit to such data, so analysts often rely on subsamples or analyze the spatial data region by region.

Ying Sun and her PhD student Huang Huang developed a new method that uses a hierarchical low-rank approximation scheme to reduce this computational burden, providing an efficient tool for fitting Gaussian process models to datasets that contain large quantities of climate and environmental measurements.

"One advantage of our method is that we apply the low-rank approximation hierarchically when fitting the Gaussian process, which makes analyzing large spatial datasets possible without excessive computation," explains Huang. "The challenge, however, is to retain estimation accuracy by using a computationally efficient approximation."

Traditional low-rank methods are usually computationally fast, but often inaccurate. The researchers therefore made the low-rank approximation hierarchical, ensuring that the covariance matrix used to fully characterize dependence in the spatial data is not itself low rank. This makes the method as fast as traditional methods while significantly improving accuracy.
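
As a rough illustration of that contrast, the NumPy sketch below compares a global rank-k truncation of a covariance matrix with a one-level block scheme that keeps the diagonal blocks exact and compresses only the off-diagonal block. It is a toy under stated assumptions, not the researchers' implementation: the exponential covariance, the 400 synthetic locations, the single split into two halves, and the helper names (exp_cov, global_low_rank, one_level_hierarchical) are all chosen here for illustration. The published method is described as hierarchical, so it presumably nests such splits rather than stopping at one level.

```python
import numpy as np

def exp_cov(points, corr_range=0.1):
    """Exponential covariance exp(-distance / corr_range) for all pairs of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.exp(-np.sqrt((diff ** 2).sum(axis=-1)) / corr_range)

def global_low_rank(K, k):
    """Best rank-k approximation of a matrix via truncated SVD."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def one_level_hierarchical(K, k):
    """Keep both diagonal blocks exact; compress only the off-diagonal block."""
    h = K.shape[0] // 2
    approx = K.copy()
    off = global_low_rank(K[:h, h:], k)
    approx[:h, h:] = off
    approx[h:, :h] = off.T
    return approx

# Synthetic locations in the unit square (an assumption, not real data).
rng = np.random.default_rng(0)
pts = rng.uniform(size=(400, 2))
pts = pts[np.argsort(pts[:, 0])]  # sort by x so each half is a spatial region
K = exp_cov(pts)
k = 10

for name, approx in [("global", global_low_rank(K, k)),
                     ("one-level hierarchical", one_level_hierarchical(K, k))]:
    rel_err = np.linalg.norm(K - approx) / np.linalg.norm(K)
    print(name, "relative error:", rel_err,
          "numerical rank:", np.linalg.matrix_rank(approx))
```

Both approximations store rank-k factors for the part they compress, but the block version leaves the strong short-range dependence inside each region untouched, which is why its approximation is not a low-rank matrix overall.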

To evaluate their model's performance, they undertook numerical analysis and simulations and found the model performs much better than the most commonly used methods. This ensures that credible inferences can be made from real-world datasets.

The model was applied to a spatial dataset of two million soil-moisture measurements from the Mississippi River basin in the United States. The researchers were able to fit a Gaussian process model to understand the spatial variability and predict values at unsampled locations. This led to a better understanding of hydrological processes, including runoff generation and drought development, and of climate variability for the region.
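
Predicting at unsampled locations from a fitted Gaussian process is commonly done by kriging. The sketch below shows the exact (non-approximated) version on a tiny synthetic example, which is only feasible for small datasets. Everything in it is an assumption made for illustration: the exponential covariance with a fixed range, the plug-in constant mean, the function names, and the synthetic values, which are not the soil-moisture data.

```python
import numpy as np

def exp_cov(a, b, corr_range=0.2):
    """Exponential covariance between every point in a and every point in b."""
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    return np.exp(-d / corr_range)

def krige(obs_pts, obs_vals, new_pts, corr_range=0.2):
    """Kriging prediction and variance with unit sill and a plug-in mean."""
    K = exp_cov(obs_pts, obs_pts, corr_range)       # observed vs observed
    k_star = exp_cov(new_pts, obs_pts, corr_range)  # new vs observed
    mean = obs_vals.mean()                          # constant mean, estimated crudely
    weights = np.linalg.solve(K, obs_vals - mean)
    pred = mean + k_star @ weights
    # Prediction variance: 1 - k*^T K^{-1} k* for each new location.
    var = 1.0 - np.einsum("ij,ji->i", k_star, np.linalg.solve(K, k_star.T))
    return pred, var

# Synthetic observations on the unit square (not real measurements).
rng = np.random.default_rng(1)
obs_pts = rng.uniform(size=(50, 2))
obs_vals = np.sin(4 * obs_pts[:, 0]) + obs_pts[:, 1]

new_pts = np.array([[0.25, 0.25], [0.5, 0.5], [0.9, 0.1]])
pred, var = krige(obs_pts, obs_vals, new_pts)
print("predictions:", pred)
print("prediction variances:", var)
```

The linear solves against the full covariance matrix are the step whose cost grows rapidly with the number of observations, which is the burden the approximation is meant to relieve.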

"Our research provides a powerful tool for the statistical inference of large spatial data, says Sun. "And when exact computations are not possible, environmental scientists could use our methodology to handle large datasets instead of only analyzing subsamples. This makes it a practical and attractive technique for very large and environmental datasets."

More information: Hierarchical low rank approximation of likelihoods for large spatial datasets. arxiv.org/abs/1605.08898
