Big data used to predict the future

November 12, 2018, University of Córdoba

Technology is advancing by leaps and bounds, and with it the volume of information that society handles every day. That data, however, has to be organized, analyzed and correlated before it can be used to predict patterns. This is one of the main functions of what is known as Big Data.

Researchers in the KIDS research group at the University of Córdoba's Department of Computer Science and Numerical Analysis have improved models that predict several variables simultaneously from the same set of input variables, reducing the amount of data needed for an accurate forecast. One example is a method that predicts several parameters related to soil quality from a set of variables such as crops planted, tillage and the use of pesticides.
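
To make the idea of predicting several variables from one set of inputs concrete, here is a minimal Python sketch. It is not the model from the study: it uses a plain k-nearest-neighbour average as a stand-in, and the function name `predict_multi_output`, the input and output variables and all of the numbers are made-up assumptions for illustration only.

```python
from math import dist


def predict_multi_output(train_X, train_Y, query, k=3):
    """Predict every target variable at once by averaging the k nearest examples."""
    # Rank training examples by Euclidean distance to the query point.
    ranked = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    nearest = ranked[:k]
    n_targets = len(train_Y[0])
    # Average each target column over the selected neighbours.
    return [sum(train_Y[i][t] for i in nearest) / len(nearest)
            for t in range(n_targets)]


# Hypothetical toy data (invented values, not from the study):
# inputs  = (tillage intensity, pesticide use)
# outputs = (organic matter, moisture)
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
train_Y = [(4.0, 30.0), (3.0, 28.0), (3.0, 26.0), (2.0, 24.0), (0.5, 10.0)]

prediction = predict_multi_output(train_X, train_Y, query=(0.1, 0.1), k=3)
print(prediction)  # one predicted value per output variable
```

A single call returns one value per output variable, which is the defining feature of this kind of model: the same inputs feed all of the forecasts.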

"When you are dealing with a large volume of data, there are two solutions. You either increase computer performance, which is very expensive, or you reduce the quantity of information needed for the process to be done properly," says researcher Sebastian Ventura, one of the authors of the research article.

When building a predictive model, reliable results depend on two factors: the number of variables that come into play and the number of examples entered into the system. Working on the idea that less is more, the study reduced the number of examples by eliminating those that are redundant or "noisy," and that therefore contribute no useful information to building a better model.
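
The sketch below illustrates, in Python, what discarding redundant and "noisy" examples can look like. It is a deliberately simple heuristic and not the ensemble-based method described in the research article: an example is treated as noisy when its outputs sit far from the median outputs of its nearest neighbours, and as redundant when it duplicates an example already kept. The function name `select_instances`, the thresholds and the data are assumptions made for illustration.

```python
from math import dist
from statistics import median


def select_instances(X, Y, k=3, noise_tol=5.0, dup_tol=1e-9):
    """Return the indices of the examples worth keeping (illustrative heuristic)."""
    n_targets = len(Y[0])
    kept = []
    for i in range(len(X)):
        # The k nearest neighbours of example i, excluding i itself.
        neighbours = sorted((j for j in range(len(X)) if j != i),
                            key=lambda j: dist(X[i], X[j]))[:k]
        # Per-target median of the neighbours' outputs (robust to one outlier).
        expected = [median(Y[j][t] for j in neighbours) for t in range(n_targets)]
        if dist(Y[i], expected) > noise_tol:
            continue  # noisy: outputs disagree with the neighbourhood
        if any(dist(X[i], X[j]) <= dup_tol and dist(Y[i], Y[j]) <= dup_tol
               for j in kept):
            continue  # redundant: duplicates an example already kept
        kept.append(i)
    return kept


# Invented data: outputs follow (x, 2x), except one duplicate and one outlier.
X = [(0.0,), (0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (2.5,)]
Y = [(0.0, 0.0), (0.0, 0.0), (1.0, 2.0), (2.0, 4.0),
     (3.0, 6.0), (4.0, 8.0), (40.0, -30.0)]

kept = select_instances(X, Y)
print(kept)  # indices of the examples that survive the filter
```

On this toy data the duplicate at index 1 and the outlier at index 6 are dropped, so a model would be trained on five examples instead of seven.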

As Oscar Reyes, the lead author of the research, points out, "we have developed a technique that can tell you which set of examples you need so that the forecast is not only reliable but could even be better." In some of the 18 databases that were analyzed, they were able to reduce the amount of data by 80 percent without affecting predictive performance, meaning that far less than half the original data was used. All of this, says Reyes, "means saving energy and money in the building of a model, as less computing power is required." It also means saving time, which matters for applications that work in real time, since "it doesn't make sense for a model to take half an hour to run if you need a prediction every five minutes."

Systems that predict several related variables simultaneously, known as multi-output regression models, are becoming increasingly important because of the wide range of applications that can be analyzed under this machine learning paradigm, such as those related to healthcare, water quality, cooling systems for buildings and environmental studies.

More information: Oscar Reyes et al, An ensemble-based method for the selection of instances in the multi-target regression problem, Integrated Computer-Aided Engineering (2018). DOI: 10.3233/ICA-180581
