A decline in gene discoveries

The outer ring in the diagram visualizes the level of darkness in the human proteome. About 1,600 proteins belong to the intensively studied proteome (>500 full publication equivalents, or FPEs), and another ~3,200 are also well analyzed (>100 FPEs). At the other end of the spectrum, ~4,000 proteins are not mentioned in any article (super-dark proteome) and ~6,500 proteins have fewer than 10 FPEs (dark proteome). About 6,500 proteins have between 10 and 100 FPEs (illuminated proteome). The middle inset shows how many proteins cross the threshold of 0 (T0), 10 (T10), 20 (T20), 50 (T50), 100 (T100) or 500 (T500) FPEs in a given year. Credit: A*STAR Bioinformatics Institute

The number of papers reporting new protein-function discoveries in 2017 was two-thirds lower than in 2000, according to research led by A*STAR.

While the Human Genome Project has made the entire human genetic code available to researchers, making sense of this vast trove of data is challenging.

"For many biologists, the discovery of a gene function completely changes their lives; it is their main scientific achievement," says Frank Eisenhaber, director of A*STAR's Bioinformatics Institute (BII), who led the study.

The BII team, together with Lars Juhl Jensen from the University of Copenhagen, wanted to explore how the rate of new gene structure and function discoveries changed between 1901 and 2017. They did so by counting the papers and patents in the biomedical literature that described previously unknown gene and protein functions.

To do this, they came up with a score, called a 'full publication equivalent' or FPE, representing the published equivalent of one whole paper dedicated solely to a single genomic entity, whether a gene, a protein, or a non-coding RNA.
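The article does not spell out how an FPE is calculated. As a rough illustration only, the sketch below assumes that each paper carries a total weight of one, split among the genomic entities it mentions in proportion to how often each is mentioned. The function name `fpe_scores` and the toy corpus are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def fpe_scores(papers):
    """Sum fractional paper credit per genomic entity.

    ASSUMPTION (not from the article): a paper's weight of 1.0 is divided
    among the entities it mentions, proportional to their mention counts.
    """
    scores = defaultdict(float)
    for mentions in papers:
        total = sum(mentions.values())
        if total == 0:
            continue  # paper mentions no tracked entity
        for entity, count in mentions.items():
            scores[entity] += count / total
    return dict(scores)


# Hypothetical toy corpus: one dict per paper, entity name -> mention count.
papers = [
    {"TP53": 10},           # whole paper on one protein: 1.0 FPE
    {"TP53": 3, "INS": 1},  # credit split 0.75 / 0.25
    {"INS": 2, "ALB": 2},   # credit split 0.5 / 0.5
]
scores = fpe_scores(papers)
```

Under this assumption the scores across all entities add up to the number of papers that mention at least one entity, so a heavily studied protein accumulates credit without any paper being counted twice.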

Overall, they found references to 17,824 human proteins and 2,641 human noncoding RNAs in the literature over that period. Of these proteins, 1,610 (9 per cent) scored more than 500 FPEs and accounted for 78 per cent of all relevant papers published. Some of the most frequently mentioned proteins included insulin, serum albumin, tumor necrosis factor and p53.

A further 16 per cent of the literature was dedicated to another 3,207 proteins (18 per cent of the total), which scored between 100 and 500 FPEs. Just over one-third of all proteins mentioned in the literature (6,439 genomic entities) had 10 to 100 FPEs. But only 6 per cent of the literature was left to cover more than 13,000 genomic entities.
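The figures above can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the article and the tier names from the figure caption; how scores of exactly 10, 100 or 500 FPEs are assigned is an assumption, since the article does not say.

```python
def tier(fpe):
    """Map an FPE score to the tier names used in the figure caption.

    ASSUMPTION: boundary values (exactly 10, 100 or 500) fall into the
    lower-numbered side shown here; the article does not specify this.
    """
    if fpe > 500:
        return "intensively studied"
    if fpe > 100:
        return "well analyzed"
    if fpe >= 10:
        return "illuminated"
    if fpe > 0:
        return "dark"
    return "super-dark"


TOTAL_PROTEINS = 17_824  # proteins referenced in the literature
counts = {
    "intensively studied": 1_610,
    "well analyzed": 3_207,
    "illuminated": 6_439,
}

# Share of referenced proteins in each tier, rounded to whole per cent.
shares = {name: round(100 * n / TOTAL_PROTEINS) for name, n in counts.items()}

# Referenced proteins not covered by the three tiers above.
remaining = TOTAL_PROTEINS - sum(counts.values())

# Literature share left after the top two tiers (78 and 16 per cent).
literature_left = 100 - 78 - 16

# Proteins outside the top two tiers; this may be what "more than 13,000"
# refers to, but that reading is an interpretation, not stated in the article.
outside_top_two = counts["illuminated"] + remaining
```

The rounded shares come out at 9, 18 and 36 per cent, matching the article's "9 per cent", "18 per cent" and "just over one-third".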

The rate of protein function discoveries increased steadily from 1980 to 2000, such that by the year 2000 around 500 new protein names were being reported in the literature each year.

"The appearance of a new gene name in the literature means that there is a new opening and people seriously start thinking what this gene might mean in terms of physiology and biomedical application," Frank Eisenhaber says.

Then, in 2000, the trend changed. Although the draft human genome sequence became available in 2001, which should have made genomic discoveries easier, the publication rate began a sustained decline. In 2017, the number of genes appearing in the literature for the first time was one-third of the number that appeared in 2000.

"That's a huge drop," Frank Eisenhaber says. "And since function discoveries mainly come from elite institutions, it means they are also affected on a great scale, and that this is a worldwide phenomenon."

He suggests that the decline in publications reporting new genes and functions may be the result of a diversion in research funding from core budgets towards more short-term, grant-based funding, as well as shorter contracts for academic and research staff.

"For a well-characterized gene, plasmids and antibodies and everything is available, whereas for new genes, you don't have an antibody or plasmid, you need to produce them yourself," he says. "It can easily take another year to technically prepare the research besides the scientific challenges when nothing is known, but if you have only two years of a post-doc, can you afford the time to do that?"

The concern is that focusing so much research effort and funding on known genes and their function will leave large areas of the human genome in darkness, and reduce scientists' ability to explore the full function and structure of our genetic material and apply these results for biomedical benefits.

More information: Swati Sinha et al. Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000, Proteomics (2018). DOI: 10.1002/pmic.201800093
Journal information: Proteomics

Citation: A decline in gene discoveries (2019, February 22) retrieved 31 October 2020 from https://phys.org/news/2019-02-decline-gene-discoveries.html