AI and big data predict which research will influence future medical treatments

This image depicts the co-citation network of seminal fundamental publications that led to the clinical development of cancer immunotherapy treatments. Large dots (center) represent the most influential clinical trials that formed part of the evidence base for FDA approval of these treatments. Heat mapping indicates the extent to which the research was human-focused; at the extremes, each green dot represents a fundamental research publication and each red dot a publication describing human research. This network was generated using open access data from the new modules of the iCite webtool described in two new articles from Hutchins and colleagues. Credit: Ian Hutchins and George Santangelo

An artificial intelligence/machine learning model to predict which scientific advances are likely to eventually translate to the clinic has been developed by Ian Hutchins and colleagues in the Office of Portfolio Analysis (OPA), a team led by George Santangelo at the National Institutes of Health (NIH). This work, described in a Meta-Research article published October 10 in the open-access journal PLOS Biology, aims to decrease the sometimes decades-long interval between scientific discovery and clinical application; the method determines the likelihood that a research article will be cited by a future clinical trial or guideline, an early indicator of translational progress.

Hutchins and colleagues have quantified these predictions, which are highly accurate with as little as two years of post-publication data, as a novel metric called "Approximate Potential to Translate" (APT). APT values can be used by researchers and decision-makers to focus attention on areas of science that have strong signatures of translational potential. Although numbers alone should never be a substitute for evaluation by human experts, the APT metric has the potential to accelerate biomedical progress as one component of data-driven decision-making.

The model that computes APT values makes predictions based upon the content of research articles and the articles that cite them. A long-standing barrier to research and development of metrics like APT is that such citation data has remained hidden behind proprietary, restrictive, and often costly licensing agreements. To remove this impediment, to increase transparency, and to facilitate reproducibility, OPA has aggregated citation data from publicly available resources to create an open citation collection (NIH-OCC), the details of which appear in a Community Page article in the same issue of PLOS Biology. The NIH-OCC comprises over 420 million citation links at present and will be updated monthly as citations continue to accumulate. For publications since 2010, the NIH-OCC is already more comprehensive than leading proprietary sources of citation data.

Citation data from the NIH-OCC are used to calculate both APT values and Relative Citation Ratios (RCRs). The latter, a measure of scientific influence at the article level, normalized for the field of study and time since publication, was developed previously by Santangelo's team at NIH and has already been widely adopted in both the scientific and evaluator communities. Upon publication of these two articles, APT values and the NIH-OCC will be freely and publicly available as new components of the iCite webtool, which will continue as the primary source of RCR data. The OPA team encourages the use of iCite to improve research assessment and decision-making that can contribute to optimizing the scientific enterprise.

More information: Meta-Research Article: Hutchins BI, Davis MT, Meseroll RA, Santangelo GM (2019) Predicting translational progress in biomedical research. PLoS Biol 17(10): e3000416.

Journal information: PLoS Biology

Citation: AI and big data predict which research will influence future medical treatments (2019, October 10) retrieved 25 June 2024 from
