Computer scientists at the University of California, Riverside are working with a doctor at Children's Hospital Los Angeles to mine data collected from pediatric intensive care units in hopes of helping doctors treat children and cutting health care costs.
The researchers, led by Eamonn Keogh, a computer science professor at UC Riverside's Bourns College of Engineering, have received a four-year, $1.2 million grant from the National Science Foundation.
"This data has the potential to be a gold mine of useful – literally life saving – information," said Keogh, who specializes in data mining, which involves searching for patterns and irregularities in large data sets.
He is working with: Dr. Randall Wetzel, of Children's Hospital Los Angeles; Walid Najjar and Vasilis Tsotras, both computer science professors at UC Riverside; and David Kale, one of Keogh's graduate students.
Today, in pediatric intensive care units across the nation, sensors are attached to children to record up to 30 measurements, such as pulse rate, blood pressure and temperature. The sensors allow for real-time monitoring of the child and can trigger an alarm if, for example, a child's temperature exceeds 100 degrees Fahrenheit.
Usually, the sensors only display the last few minutes of data and figures such as the minimum and maximum temperature for that day. In most cases, the rest of the data is discarded.
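To illustrate what is lost under that arrangement, here is a minimal sketch of a monitor that keeps only a short window of recent readings plus a running daily minimum and maximum, and raises an alarm above a fixed threshold. The class name `BedsideMonitor`, its parameters and the sample readings are hypothetical and are not taken from any real device; only the 100-degree threshold comes from the example above.

```python
from collections import deque


class BedsideMonitor:
    """Hypothetical monitor: short recent buffer, daily min/max, threshold alarm."""

    def __init__(self, alarm_above=100.0, keep_last=10):
        self.alarm_above = alarm_above          # e.g. degrees Fahrenheit
        self.recent = deque(maxlen=keep_last)   # older readings fall off the end
        self.day_min = None
        self.day_max = None

    def record(self, value):
        """Store one reading and return True if it should trigger an alarm."""
        self.recent.append(value)
        self.day_min = value if self.day_min is None else min(self.day_min, value)
        self.day_max = value if self.day_max is None else max(self.day_max, value)
        return value > self.alarm_above


# Made-up temperature readings, for illustration only.
monitor = BedsideMonitor(alarm_above=100.0, keep_last=3)
alarms = [monitor.record(t) for t in [98.6, 99.1, 100.4, 99.0]]
print(alarms, list(monitor.recent), monitor.day_min, monitor.day_max)
```

Once a reading leaves the short buffer, only the minimum and maximum survive, which is why the shape of the signal over time cannot be mined afterward.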
This is in part due to legal and privacy issues, which the researchers believe can be solved. It's also because computer scientists didn't have the tools to mine the vast amounts of data produced in pediatric intensive care units.
That changed after Keogh and a group of researchers recently developed a new technique that makes it possible to search datasets with more than one trillion objects. That's a larger set than the combined size of all datasets in all data mining papers ever published.
The new technique was outlined in a paper, "Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping." It was named best paper at the ACM SIGKDD data mining conference in August in Beijing.
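For readers unfamiliar with dynamic time warping, the following is a textbook sketch of the distance measure and a brute-force search for the best-matching subsequence. It is not the method from the paper, whose speedups are what make trillion-object searches feasible and are not reproduced here. The function names `dtw_distance` and `best_match`, the `window` parameter and the sample numbers are assumptions made for illustration.

```python
import math


def dtw_distance(a, b, window=None):
    """Textbook dynamic time warping distance with squared-error point cost.

    `window` optionally limits how far the alignment may stray from the
    diagonal; by default the warping is unconstrained.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    window = max(window, abs(n - m))  # the end cell must stay reachable
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def best_match(query, series, window=None):
    """Brute force: slide the query over the series, return (start index, distance)."""
    best_index, best_dist = -1, math.inf
    for start in range(len(series) - len(query) + 1):
        d = dtw_distance(query, series[start:start + len(query)], window)
        if d < best_dist:
            best_index, best_dist = start, d
    return best_index, best_dist


# Made-up numbers: the query shape appears exactly at index 2 of the series.
print(best_match([1, 2, 3], [0, 0, 1, 2, 3, 0, 0]))
```

Unlike a point-by-point comparison, the warping step lets two signals match even when one is locally stretched or compressed in time. The brute-force search compares the query against every possible position, which is the cost that becomes prohibitive at very large scale.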
During the past five years, Children's Hospital Los Angeles has collected data from its pediatric intensive care units. It's typically sampled once every 30 seconds. This dataset includes more than one billion individual measurements.
With the support of the grant, Children's Hospital Los Angeles plans to explore options to capture and store data from five or more sensors and capture multiple data points per second.
In the coming years, Keogh and the team of researchers plan to investigate two interconnected areas.
One is mining the archived pediatric intensive care unit data from Children's Hospital Los Angeles to find regularities and patterns that can aid doctors in diagnosing and predicting medical episodes.
The second is taking the regularities and patterns they discover and incorporating them in real time into intensive care unit sensors to see if they help doctors.
Keogh plans to use the archived data to develop algorithms that incorporate what he calls "if-then rules" that can assist doctors. For example, if a heartbeat looks like this, then a child may have difficulty breathing in five seconds.
The difficulty, Keogh said, is finding medically useful patterns, because there are an infinite number of trivial ones, such as that people who have babies are female and that people over six feet tall are also over five feet tall.
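One way to picture such an "if-then rule" is as a stored signal shape paired with a warning that fires when the most recent readings look sufficiently like that shape. The sketch below is only an assumption about how a rule might be represented, not the researchers' design. The names `Rule` and `check`, the use of plain Euclidean distance, the threshold and all the numbers are hypothetical and carry no medical meaning.

```python
import math
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: list      # hypothetical template shape (the "if" part)
    threshold: float   # how close the recent signal must be to count as a match
    warning: str       # message to show the doctor (the "then" part)


def euclidean(a, b):
    """Point-by-point distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def check(rule, readings):
    """Return the rule's warning if the latest readings match its pattern."""
    if len(readings) < len(rule.pattern):
        return None  # not enough data yet
    latest = readings[-len(rule.pattern):]
    if euclidean(latest, rule.pattern) <= rule.threshold:
        return rule.warning
    return None


# Made-up values, for illustration only.
rule = Rule(pattern=[1.0, 3.0, 1.0], threshold=0.5,
            warning="possible breathing difficulty in five seconds")
print(check(rule, [0.0, 0.0, 1.0, 3.1, 1.0]))
print(check(rule, [0.0, 0.0, 0.0, 0.0, 0.0]))
```

In this framing, the mining work supplies the patterns and thresholds, and the real-time work is running a check like this against each new reading as it arrives.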
"We have to find those that aren't known but are useful and that can benefit from intervention," Keogh said. "That will be tricky."