Artificial intelligence project to help bring the power of the sun to Earth is picked for first U.S. exascale system
To capture and control the process of fusion that powers the sun and stars in facilities on Earth called tokamaks, scientists must confront disruptions that can halt the reactions and damage the doughnut-shaped devices. Now an artificial intelligence system under development at the U.S. Department of Energy's (DOE) Princeton Plasma Physics Laboratory (PPPL) and Princeton University to predict and tame such disruptions has been selected as an Aurora Early Science project by the Argonne Leadership Computing Facility, a DOE Office of Science User Facility.
The project, titled "Accelerated Deep Learning Discovery in Fusion Energy Science" is one of 10 Early Science Projects on data science and machine learning for the Aurora supercomputer, which is set to become the first U.S. exascale system upon its expected arrival at Argonne in 2021. The system will be capable of performing a quintillion (1018) calculations per second—50-to-100 times faster than the most powerful supercomputers today.
Fusion combines light elements
Fusion combines light elements in the form of plasma—the hot, charged state of matter composed of free electrons and atomic nuclei—in reactions that generate massive amounts of energy. Scientists aim to replicate the process for a virtually inexhaustible supply of power to generate electricity.
The goal of the PPPL/Princeton University project is to develop a method that can be experimentally validated for predicting and controlling disruptions in burning plasma fusion systems such as ITER—the international tokamak under construction in France to demonstrate the practicality of fusion energy. "Burning plasma" refers to self-sustaining fusion reactions that will be essential for producing continuous fusion energy.
Heading the project will be William Tang, a principal research physicist at PPPL and a lecturer with the rank and title of professor in the Department of Astrophysical Sciences at Princeton University. "Our research will utilize capabilities to accelerate progress that can only come from the deep learning form of artificial intelligence," Tang said.
Networks analogous to a brain
Deep learning, unlike other types of computational approaches, can be trained to solve with accuracy and speed highly complex problems that require realistic image resolution. Associated software consists of multiple layers of interconnected neural networks that are analogous to simple neurons in a brain. Each node in a network identifies a basic aspect of data that is fed into the system and passes the results along to other nodes that identify increasingly complex aspects of the data. The process continues until the desired output is achieved in a timely way.
The PPPL/Princeton deep-learning software is called the "Fusion Recurrent Neural Network (FRNN)," composed of convolutional and recurrent neural nets that allow a user to train a computer to detect items or events of interest. The software seeks to speedily predict when disruptions will break out in large-scale tokamak plasmas, and to do so in time for effective control methods to be deployed.
The project has greatly benefited from access to the huge disruption-relevant data base of the Joint European Torus (JET) in the United Kingdom, the largest and most powerful tokamak in the world today. The FRNN software has advanced from smaller computer clusters to supercomputing systems that can deal with such vast amounts of complex disruption-relevant data. Running the data aims to identify key pre-disruption conditions, guided by insights from first principles-based theoretical simulations, to enable the "supervised machine learning" capability of deep learning to produce accurate predictions with sufficient warning time.
Access to Princeton's Tiger cluster
The project has gained from access to Tiger, a high-performance Princeton University cluster equipped with advanced image-resolution GPUs that have enabled the deep learning software to advance to the Titan supercomputer at Oak Ridge National Laboratory and to powerful international systems such as the Tsubame-III supercomputer in Tokyo, Japan. The overall goal is to achieve the challenging requirements for ITER, which will need predictions to be 95 percent accurate with less than 5 percent false alarms at least 30 milliseconds or longer before disruptions occur.
The team will continue to build on advances that are currently supported by the DOE while preparing the FRNN software for Aurora exascale computing. The researchers will also move forward with related developments on the SUMMIT supercomputer at Oak Ridge.
Members of the team include Julian Kates-Harbeck, a graduate student at Harvard University and a DOE Office of Science Computational Science Graduate Fellow (CSGF) who is the chief architect of the FRNN. Researchers include Alexey Svyatkovskiy, a big-data, machine learning expert who will continue to collaborate after moving from Princeton University to Microsoft; Eliot Feibush, a big data analyst and computational scientist at PPPL and Princeton, and Kyle Felker, a CSGF member who will soon graduate from Princeton University and rejoin the FRNN team as a post-doctoral research fellow at Argonne National Laboratory.