Young and Karr propose ways to improve how observational studies are conducted

August 25, 2011

S. Stanley Young, assistant director for bioinformatics at the National Institute of Statistical Sciences (NISS), and Alan Karr, director of NISS, have published a non-technical article in the September issue of Significance magazine. They point out that medical and other observational studies often produce results that are later shown to be incorrect and, invoking a quality control perspective, suggest ways to fix the system.

Their central point is that the current system of publication in peer-reviewed journals relies on post-production inspection to ensure quality, a practice that has disappeared from modern industry in favor of controlling the process itself: quality control is now process control, not product control. They cite W. Edwards Deming, considered by many the most innovative thinker ever about quality, arguing not only for process control but also that the problem lies with the managers (funders and journals) rather than with the workers (individual researchers, who respond rationally to the current set of incentives).

Young and Karr describe both their own and others' studies of the extent to which observational studies fail to replicate. Published claims such as "coffee causes pancreatic cancer" or "women eating breakfast cereal are more likely to have boy babies" have been refuted by subsequent studies and analyses. When such claims reach the popular media and influence individual consumers, the burden falls not just on science but also on society. And even if there were no impact on the public, scarce research resources, both money and personnel, have been squandered.

The paper describes several technical difficulties with observational studies, among them multiple testing (if enough questions are asked, some will yield false positive answers), bias (systematic error) and multiple modeling (searching among mathematical models until one is found that "fits the data"). Publication bias is another issue: papers reporting positive scientific results (for example, an association between Type A personalities and heart attacks) are more likely to be published than those reporting negative results, even though the latter may be as important scientifically.
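The multiple testing problem can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the article: it assumes every question is tested independently at the conventional 0.05 significance level and that no real effect exists, so each p-value is uniform on [0, 1]. The function names and the choice of 0.05 are assumptions made for the example.

```python
import random


def chance_of_false_positive(num_questions, alpha=0.05):
    """Probability that at least one of `num_questions` independent tests
    comes out "significant" at level `alpha` when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_questions


def simulate_false_positive_rate(num_questions, alpha=0.05, trials=5000, seed=0):
    """Estimate the same probability by simulation.

    Under the null hypothesis a p-value is uniformly distributed on [0, 1],
    so each simulated "study" draws `num_questions` uniform p-values and
    checks whether any of them falls below `alpha`.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(trials):
        if any(rng.random() < alpha for _ in range(num_questions)):
            studies_with_a_hit += 1
    return studies_with_a_hit / trials


if __name__ == "__main__":
    for m in (1, 5, 20, 60):
        exact = chance_of_false_positive(m)
        simulated = simulate_false_positive_rate(m)
        print(f"{m:3d} questions: exact {exact:.3f}, simulated {simulated:.3f}")
```

Under these assumptions a study that asks a single question has a 5 percent chance of a spurious finding, while one that asks 60 questions is more likely than not to report at least one.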

Young and Karr recommend that when a study is submitted for publication, the data be split into two sets: a modeling data set and a holdout data set. Journals would then accept or reject papers based on the analysis of the modeling data set, without knowing the results of applying the methods to the holdout set. The journal would also publish an addendum to the paper giving the results of the analysis of the holdout set.
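A minimal sketch of such a split is shown below. The article as summarized here does not say how the data should be divided, so the random assignment, the 50/50 proportion, the fixed seed and the function name `split_modeling_holdout` are all assumptions made for illustration.

```python
import random


def split_modeling_holdout(records, holdout_fraction=0.5, seed=2011):
    """Randomly divide `records` into a modeling set and a holdout set.

    The holdout fraction and the seed are illustrative assumptions. Fixing
    the seed makes the split reproducible, so the holdout analysis can be
    checked later against the same records.
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    holdout = shuffled[:n_holdout]
    modeling = shuffled[n_holdout:]
    return modeling, holdout


if __name__ == "__main__":
    # Hypothetical subject identifiers standing in for study records.
    subject_ids = range(1, 101)
    modeling, holdout = split_modeling_holdout(subject_ids)
    print(len(modeling), "records for modeling;", len(holdout), "held out")
```

The point of the design is that the holdout records are never touched while the models are being chosen, so the analysis reported in the addendum is a fresh test and not a second look at the same data.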

More information: Significance magazine is published by the Royal Statistical Society of the UK and the American Statistical Association. A copy of the article will be made available at:

1 comment

Sep 10, 2011
A subscription to Significance costs $150 a year. It is free if you belong to the American Statistical Association or the Royal Statistical Society. Some libraries may have free access.
For all the rest of us unemployed have-nots, it costs too much.
