Solar flares can release the energy equivalent of many atomic bombs, enough to cut out satellite communications and damage power grids on Earth, 93 million miles away. The flares arise from twisted magnetic fields that occur all over the sun's surface, and they increase in frequency every 11 years, a cycle that is now at its maximum.
Using artificial intelligence techniques, Stanford solar physicists Monica Bobra and Sebastien Couvidat have automated the analysis of the largest ever set of solar observations to forecast solar flares using data from the Solar Dynamics Observatory (SDO), which takes more data than any other satellite in NASA history. Their study identifies which features are most useful for predicting solar flares.
Specifically, their study required analyzing vector magnetic field data. Historically, instruments measured the line-of-sight component of the solar magnetic field, an approach that showed only the amplitude of the field. Later, instruments showed the strength and direction of the fields, called vector magnetic fields, but for only a small part of the sun, or part of the time. Now the Helioseismic and Magnetic Imager (HMI), an instrument aboard SDO, collects vector magnetic fields and other observations of the entire sun almost continuously.
Adding machine learning
The Stanford Solar Observatories Group, headed by physics Professor Phil Scherrer, processes and stores the data from SDO, which collects 1.5 terabytes a day. During a recent afternoon tea break, the group members chatted about what they might do with all that data and talked about trying something different.
They recognized how difficult it is to form predictions from many data points using pure theory, and they had heard of the popularity of the online class on machine learning taught by Andrew Ng, a Stanford professor of computer science.
"Machine learning is a sophisticated way to analyze a ton of data and classify it into different groups," Bobra said.
Machine learning software ascribes information to a set of established categories. The software looks for patterns and tries to see which information is relevant for predicting a particular category.
For example, one could use machine-learning software to predict whether or not people are fast swimmers. First, the software looks at features of swimmers – their heights, weights, dietary habits, sleeping habits, their dogs' names and their dates of birth.
And then, through a guess and check strategy, the software would try to identify which information is useful in predicting whether or not a swimmer is particularly speedy. It could look at a swimmer's height and guess whether that particular height lies within the height range of speedy swimmers, yes or no. If it guessed correctly, it would "learn" that the height might be a good predictor of speed.
The software might find that a swimmer's sleeping habits are good predictors of speed, whereas the name of the swimmer's dog is not.
The predictions would not be very accurate after analysis of just the first few swimmers. The more information provided, the better machine learning gets at predicting.
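The guess-and-check idea in the swimmer example can be sketched in a few lines of Python. This is only an illustration, not the researchers' software: the heights and labels are invented, and the helper names `accuracy` and `best_threshold` are hypothetical. The sketch tries each observed height as a cutoff and keeps the one that classifies the most swimmers correctly.

```python
# Invented toy data for illustration: (height_cm, is_fast).
swimmers = [(160, False), (165, False), (170, False), (178, True),
            (182, True), (185, True), (174, False), (190, True)]

def accuracy(threshold, data):
    """Fraction of swimmers classified correctly by 'fast if height >= threshold'."""
    correct = sum((height >= threshold) == is_fast for height, is_fast in data)
    return correct / len(data)

def best_threshold(data):
    """Guess each observed height as a cutoff and keep the most accurate one."""
    candidates = sorted(height for height, _ in data)
    return max(candidates, key=lambda t: accuracy(t, data))

threshold = best_threshold(swimmers)
print(threshold, accuracy(threshold, swimmers))
```

With only eight swimmers the chosen cutoff fits this toy data perfectly, which says little about new swimmers; that is why more examples make the prediction more trustworthy.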
Similarly, the researchers wanted to know how successfully machine learning would predict the strength of solar flares from information about sunspots.
"We had never worked with the machine learning algorithm before, but after we took the course we thought it would be a good idea to apply it to solar flare forecasting," Couvidat said. He applied the algorithms and Bobra characterized the features of the two strongest classes of solar flares, M and X. Though others have used machine learning algorithms to predict solar flares, nobody has done it with such a large set of data and or with vector magnetic field observations.
M-class flares can cause minor radiation storms that might endanger astronauts and cause brief radio blackouts at Earth's poles. X-class flares are the most powerful.
Better flare prediction
The researchers catalogued flaring and non-flaring regions from a database of more than 2,000 active regions and then characterized those regions by 25 features such as energy, current and field gradient. They then fed the machine learning software 70 percent of the data to train it to identify relevant features, and used the remaining 30 percent to test its accuracy in predicting solar flares.
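A 70/30 split of this kind might look like the following Python sketch. It assumes a simple random shuffle, which the article does not confirm the study used, and the `regions` list is a hypothetical stand-in for the real catalogue, with a made-up label in place of the 25 measured features.

```python
import random

def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for a catalogue of active regions: (region_id, flared).
regions = [(i, i % 5 == 0) for i in range(2000)]

train, test = train_test_split(regions)
print(len(train), len(test))
```

Holding back the test portion matters because accuracy measured on the training data alone would overstate how well the software predicts flares it has not seen.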
Machine learning confirmed that the topology of the magnetic field and the energy stored in the magnetic field are very relevant to predicting solar flares. Using just a few of the 25 features, machine learning discriminated between active regions that would flare and those that would not flare. Although others have used different methods to come up with similar results, machine learning provides a significant improvement because automated analysis is faster and could provide earlier warnings of solar flares.
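One simple way to rank features by how well they separate flaring from non-flaring regions is sketched below in Python. The scoring rule (squared gap between class means divided by the summed class variances) is chosen here for illustration and is not necessarily what the study used. The feature names and numbers are invented and are not real HMI measurements.

```python
from statistics import mean, pvariance

def separation_score(values, labels):
    """Squared gap between class means divided by the summed class variances."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    spread = pvariance(pos) + pvariance(neg)
    return (mean(pos) - mean(neg)) ** 2 / spread if spread else float("inf")

# Invented values for six hypothetical active regions (True = flared).
labels = [True, True, True, False, False, False]
features = {
    "stored_energy": [9.0, 8.0, 10.0, 2.0, 3.0, 1.0],
    "unrelated_noise": [5.0, 1.0, 3.0, 4.0, 2.0, 3.0],
}

ranking = sorted(features,
                 key=lambda name: separation_score(features[name], labels),
                 reverse=True)
print(ranking)
```

A feature whose values look the same in both groups scores near zero and can be dropped, which is how a handful of features can end up doing most of the work.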
However, this study only used information from the solar surface. That would be like trying to predict Earth's weather from only surface measurements like temperature, without considering the wind and cloud cover. The next step in solar flare prediction would be to incorporate data from the sun's atmosphere, Bobra said.
Doing so would allow Bobra to pursue her passion for solar physics. "It's exciting because we not only have a ton of data, but the images are just so beautiful," she said. "And it's truly universal. Creatures from a different galaxy could be learning these same principles."
Study paper: iopscience.iop.org/0004-637X/798/2/135