New statistical models yield powerful insight from health care databases

August 3, 2017, American Statistical Association

Recognizing that administrative health care databases can be a valuable, yet challenging, tool in the nation's ongoing pursuit of personalized medicine, statisticians Liangyuan Hu and Madhu Mazumdar of the Icahn School of Medicine at Mount Sinai have developed advanced statistical modeling and analytic tools that can make health care and medical data more meaningful. Hu will present their findings August 3 at the 2017 Joint Statistical Meetings (JSM) in Baltimore, Md.

The availability of large electronic health records is promising for medical discovery and efforts to develop individualized treatments. "Powerful statistical analyses and results from these records and databases can be the foundation on which informed medical questions are asked and decisions are made," notes Hu.

For example, doctors seeking to provide optimal treatment for high-risk cancer patients could consider multiple radical prostatectomy (RP) or radiotherapy (RT) modalities. But since it is difficult to conduct studies that would yield quality results comparing RP to RT for long-term survival among such a high-risk group, physicians are limited to the available data that can help them make precise, customized decisions. "Therefore, finding evidence using statistical tools from large, representative national databases is crucial to inform such critical medical decisions," says Hu.

Demonstrating with a case study in chronic diseases, Hu will show challenges typically associated with drawing inferences from electronic health records and administrative databases. Limitations such as uncontrolled data collection settings, practice variation among physicians and missing data can lead to false conclusions if not addressed properly by rigorous statistical methods. Hu and Mazumdar's methods leverage machine learning and flexible models to draw valid inferences using data that are sampled from a representative population and reflect outcomes from actual clinical practice.

"In clinical prediction studies, we show that combining strengths of nonparametric algorithms and parametric models leads to the development of a data-driven and reproducible [approach] that will not only generate immediate public impact, but also advance developments in statistical methodology pertaining to drawing valid and useful information from vast data sources," concludes Hu.
