New gene prediction method capitalizes on multiple genomes

December 20, 2007

Researchers at Stanford University report in the online open-access journal Genome Biology a new approach to computationally predicting the locations and structures of protein-coding genes in a genome. Gene finding remains an important problem in biology, as scientists are still far from fully mapping the set of human genes.

Furthermore, gene maps for other vertebrates, including important model organisms such as mouse, are far less complete than the human annotation. The new technique, known as CONTRAST (CONditionally TRAined Search for Transcripts), works by comparing a genome of interest to the genomes of several related species.

CONTRAST exploits the fact that protein-coding genes play a specific functional role within a cell and are therefore subject to characteristic evolutionary pressures. For example, mutations that alter an important part of a protein's structure are likely to be deleterious and thus selected against. On the other hand, mutations that preserve a protein's amino acid sequence are normally well tolerated. Thus, protein-coding genes can be identified by searching a genome for regions that show evidence of such patterns of selection. However, learning to recognize such patterns when more than two species are compared has proved difficult.
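The selection pattern described above can be illustrated with a small Python sketch. It is not CONTRAST's method, only a toy illustration: it assumes two sequences that are already aligned and in the same reading frame, uses the standard genetic code, and counts how many differing codons preserve the amino acid (synonymous) versus change it (nonsynonymous). The function name `count_substitutions` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(target, informant):
    """Count differing codons that keep or change the encoded amino acid.

    Assumes (for illustration only) that both sequences are aligned and
    start in frame. Codons containing gaps or ambiguous bases are skipped.
    Returns a (synonymous, nonsynonymous) tuple.
    """
    synonymous = nonsynonymous = 0
    usable = min(len(target), len(informant))
    usable -= usable % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        a = target[i:i + 3].upper()
        b = informant[i:i + 3].upper()
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy sequences: GCT -> GCC keeps alanine, GAA -> GAT changes
# glutamate to aspartate.
print(count_substitutions("ATGGCTGAA", "ATGGCCGAT"))  # (1, 1)
```

In a real coding region, one would expect synonymous differences to dominate; a region with no such bias is less likely to be protein-coding.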

Previous systems for gene prediction were able to make effective use of one additional 'informant' genome. For example, when searching for human genes, taking into account information from the mouse genome led to a substantial increase in accuracy. However, no system was able to leverage additional informant genomes to improve upon the state-of-the-art performance achieved using mouse alone, even though adding informants was expected to make patterns of selection clearer.

CONTRAST solves this problem by learning to recognize the signature of protein-coding gene selection in a fundamentally different way from previous approaches. Instead of constructing a model of sequence evolution, CONTRAST directly 'learns' which features of a genomic alignment are most useful for recognizing genes. This approach leads to overall higher levels of accuracy and is able to extract useful information from several informant sequences.

In a test on the human genome, CONTRAST exactly predicted the full structure of 59% of the genes in the test set, compared with the previous best result of 36%. Its exact exon sensitivity of 93%, compared with a previous best of 84%, translates into many thousands of exons correctly predicted by CONTRAST but missed by previous methods. Importantly, CONTRAST's accuracy using a combination of eleven informant genomes was significantly higher than its accuracy using any single informant. The substantial advance in predictive accuracy represented by CONTRAST will further efforts to complete protein-coding gene maps for human and other organisms.
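For readers unfamiliar with the metric, the following Python sketch shows how an exact exon sensitivity figure is usually computed in gene-prediction evaluations: the fraction of annotated exons for which both boundaries are predicted exactly. That the article uses this common definition is an assumption, and the function name and coordinates below are made up for illustration; they are not data from the study.

```python
def exact_exon_sensitivity(annotated, predicted):
    """Fraction of annotated exons whose (start, end) pair is predicted exactly.

    Each exon is a (start, end) tuple; an exon with even one boundary off
    by a single base does not count as correct.
    """
    annotated = set(annotated)
    if not annotated:
        raise ValueError("no annotated exons to score against")
    return len(annotated & set(predicted)) / len(annotated)


# Hypothetical coordinates: two exons match exactly, one has a shifted end,
# and one annotated exon is missed entirely.
annotated_exons = [(100, 200), (300, 450), (600, 700), (800, 900)]
predicted_exons = [(100, 200), (300, 450), (600, 710), (1000, 1100)]
print(exact_exon_sensitivity(annotated_exons, predicted_exons))  # 0.5
```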

Further information about existing gene-prediction methods and the advance CONTRAST brings to the field can be found in a minireview by Paul Flicek, which accompanies the article by Batzoglou and colleagues.

Source: BioMed Central
