Scientists examine reproducibility of research issues and remedies
Reproducibility of scientific findings has long been an important indicator of the validity of data gleaned from research, and it is deemed even more critical in this age of ever-changing technologies and methods.
At the Sackler Colloquium focused on "Reproducibility of Research: Issues and Proposed Remedies," David B. Allison, dean of the Indiana University School of Public Health-Bloomington, and co-organizers Richard Shiffrin, IU Distinguished Professor and Luther Dana Waterman Professor of Psychological and Brain Sciences; Victoria Stodden of the University of Illinois; and the late Stephen Fienberg of Carnegie Mellon University invited discussion on the main topics of defining reproducibility in various research contexts and providing remedies that contribute to greater reproducibility and transparency.
A dozen articles shaping the colloquium were published in the March 2018 special issue of Proceedings of the National Academy of Sciences and covered a comprehensive range of subjects from the collection of data to the dissemination of findings from both the scientific and non-scientific communities.
Featured among them is "Scientific Progress Despite Irreproducibility: A Seeming Paradox," by Shiffrin; Katy Börner, IU Distinguished Professor of Engineering and Information Science; and Stephen M. Stigler of the University of Chicago. At a time when scientific advances are taking place daily and governing much of modern society, it seems contradictory that the way science is practiced makes quite a few reported findings difficult or impossible to reproduce. However, scientific practice has continued to evolve and advance despite elemental problems, not the least of which is that a number of years may have to pass before the validity, importance and usability of findings can be properly ascertained.
The researchers acknowledge that it is often possible for even invalid findings and conclusions to be reproduced but note that the repetition of fundamental (and frequently avoidable) errors is usually at the core of such a result. Therefore, to maximize progress, they recommend striking a balance between sharing promising early-stage results and maintaining strict vigilance to ensure quality reporting. Shared data must be not only reliable and important, but also of true scientific value with the ability to generalize to similar settings.
The authors examine proposed new remedies designed to reduce the degree of irreproducibility, such as demanding preregistration of studies to prevent "cherry-picking" data, and recommend reforms tailored to research type and goals. This stance is bolstered by the varying consequences that are possible if invalid results are published: findings gleaned during the exploration process would be unlikely to bring about harm, whereas a faulty real-life application of conclusions could cause serious damage. They caution, however, that any remedy or reform can have unintended consequences that slow rather than speed scientific progress.
Also highlighted in the PNAS issue is a report by Andrew Brown, assistant professor of applied health science at the IU School of Public Health-Bloomington; Kathryn A. Kaiser, assistant professor in health behavior at the University of Alabama at Birmingham; and Allison, titled "Issues with Data and Analyses: Errors, Underlying Themes, and Potential Solutions," which focuses on errors that could conceivably have been avoided by application of good established practices.
The researchers discuss such influences as citation and publication bias, mathematical miscalculations, errors in interpretation and working with bad data that was obtained through questionable methods, designs or techniques. Incorrect management and storage of data can also affect the ability to confirm findings, as can errors of communication and logic. They stress that consequences of invalid interpretation and reporting range from the loss of public trust to the potential loss of life.
The merits and drawbacks of the peer review process as a reliable upholder of scientific integrity in the literature are also deliberated. The authors acknowledge challenges inherent in this process, such as what they term "an unknown unknown," meaning that reviewers read what is presented in manuscripts without having additional information (or perhaps even complete information) with which to evaluate potential errors. Even so, Brown and the team maintain that there are certain actions and methods that well-trained scientists could and should recognize as erroneous or lacking in rigor.
Notes Brown, "The entire colloquium focused on research rigor and reproducibility. In particular, we were interested in identifying challenges and limitations to rigor and reproducibility, but, more importantly, we were interested in identifying paths forward to make science more rigorous and reproducible. We need to start from a position that scientific thinking is still our best way of coming to have objective knowledge of the world. I think we as a scientific enterprise have done a thorough job noting that reproducibility is a problem, even if more work needs to be done to determine underlying causes, more rigorously catalog errors and pitfalls, and identify the best interventions for reinforcing scientific rigor. However, we know that there are incremental and structural changes we can make now to improve the present condition of scientific investigations and publications to fulfill the mantra that science is self-correcting."
Additional articles featured in the PNAS special issue are:
- "Empirical Confidence Interval Calibration for Population-Level Effect Estimation Studies in Observational Healthcare Data," by Martijn J. Schuemie, George Hripcsak, Patrick B. Ryan, David Madigan and Marc A. Suchard.
- "Training Replicable Predictors in Multiple Studies," by Prasad Patil and Giovanni Parmigiani.
- "An Empirical Analysis of Journal Policy Effectiveness for Computational Reproducibility," by Victoria Stodden, Jennifer Seiler and Zhaokun Ma.
- "Standards for Design and Measurement Would Make Clinical Research Reproducible and Usable," by Kay Dickersin and Evan Mayo-Wilson.
- "Enhancing Primary Reports of Randomized Controlled Trials: Three Most Common Challenges and Suggested Solutions," by Guowei Li et al.
- "The Preregistration Revolution," by Brian A. Nosek, Charles R. Ebersole, Alexander C. DeHaven and David T. Mellor.
- "Metastudies for Robust Tests of Theory," by Beth Baribault et al.
- "Misrepresentation and Distortion of Research in Biomedical Literature," by Isabelle Boutron and Philippe Ravaud.
- "Crisis or Self-correction: Rethinking Media Narratives about the Well-being of Science," by Kathleen Hall Jamieson.
- "Is Science Really Facing a Reproducibility Crisis, and Do We Need It To?," by Daniele Fanelli.