New method helps researchers decode genomes

August 28, 2012 by Krishna Ramanujan
In cells, a messenger RNA (purple) is decoded by ribosomes (beige), which in turn produce chains of amino acids called polypeptides (pink coil) that make up proteins. To find where a gene's code begins, the researchers use two related but distinct translation inhibitors (red and green) to freeze the translation process. Image: Shu-Bing Qian

Although scientists sequenced the entire human genome more than 10 years ago, much work remains to understand what proteins all those genes code for.

Now, a study published online Aug. 27 describes a new approach that allows researchers to decode the genome by identifying where genes begin to encode polypeptides, the long chains of amino acids that make up proteins.

"The key to decoding the genome is knowing exactly where the genes start to encode polypeptides," said Shu-Bing Qian, the paper's senior author and an assistant professor at Cornell. "If we know where they start, then we can predict what proteins they produce based on the gene's sequence."

Genes are composed of four nucleotides, adenosine (A), cytidine (C), guanosine (G) and thymidine (T), but the code is read in groups of three consecutive nucleotides. The problem is that, depending on where one begins to read the code, a single segment of DNA can generate different polypeptides.
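The reading-frame problem can be illustrated with a short Python sketch. The nine-nucleotide segment below is made up for illustration and is not taken from the study; `codons` is a hypothetical helper name.

```python
def codons(dna, frame):
    """Split a DNA string into consecutive triplets, starting at offset `frame` (0, 1 or 2).

    Leftover nucleotides at the end that do not fill a whole triplet are dropped.
    """
    usable = dna[frame:]
    whole = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, whole, 3)]


segment = "ATGGCCTGA"  # invented example sequence, not from the paper

for frame in range(3):
    print(frame, codons(segment, frame))
# 0 ['ATG', 'GCC', 'TGA']
# 1 ['TGG', 'CCT']
# 2 ['GGC', 'CTG']
```

The same letters yield three unrelated sets of triplets, which is why knowing the exact start position matters.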

The new approach takes advantage of ribosomes, the translation machinery that decodes messenger RNA (mRNA), which carries the coding information from the DNA, and translates those codes into chains of amino acids, proteins' building blocks.

When translating mRNA, the ribosome at the start position has an empty space inside. Qian and colleagues used a special translation inhibitor that fills that empty space and freezes the ribosome in place. This allows the researchers to locate precisely where a gene starts to encode polypeptides. They then use that information to predict what proteins are produced from the sequence.

By using this method, the researchers found that the same mRNA can have multiple start sites that lead to production of different proteins.

"About 50 percent of mRNA has more than one start site," said Qian. In this way, a limited genome can have multiple possibilities, depending on where on the gene a start site occurs. For instance, if it occurs later in a gene's sequence, it can code for a shorter or totally different protein.
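As a rough illustration of how the start position changes the product, the Python sketch below translates one invented mRNA from three different AUG positions using the standard genetic code. The sequence, the `translate` helper and the chosen positions are assumptions for demonstration only; they do not come from the study.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, start):
    """Translate from index `start` until a stop codon or the end of the sequence."""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = "AUGGCAUGCAUGAAAUAGCCUAA"  # invented example sequence

print(translate(mrna, 0))  # MACMK  (first AUG)
print(translate(mrna, 9))  # MK     (later AUG, same frame: a shorter product)
print(translate(mrna, 5))  # MHEIA  (AUG in another frame: a different product)
```

A later start in the same frame gives a truncated version of the first product, while a start in a different frame gives an unrelated one.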

During transcription, mRNA substitutes uracil (U) for the T found in DNA. "Traditionally, all the known translation start sites were AUG. But we found that other codons, such as CUG, can also serve as a start site," Qian said. The finding will rewrite the conventional thinking about genes and where they start to encode, he added.
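The T-to-U substitution and the idea of more than one possible start codon can be sketched in Python as follows. This is only a string search over an invented sequence: it lists where AUG or CUG occurs, not which sites a ribosome actually uses, which the researchers determined experimentally. The function names are hypothetical.

```python
import re


def transcribe(coding_dna):
    """Return the mRNA equivalent of a coding-strand DNA string (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def candidate_starts(mrna, start_codons=("AUG", "CUG")):
    """List (position, codon) for every occurrence of the given codons, in any frame."""
    hits = []
    for codon in start_codons:
        # A lookahead pattern also finds overlapping occurrences.
        for match in re.finditer(f"(?={codon})", mrna):
            hits.append((match.start(), codon))
    return sorted(hits)


dna = "CCTGATGGCTGA"  # invented example sequence
mrna = transcribe(dna)

print(mrna)                    # CCUGAUGGCUGA
print(candidate_starts(mrna))  # [(1, 'CUG'), (4, 'AUG'), (8, 'CUG')]
```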

The results suggest that the entire complement of proteins that can be expressed by a single gene is much more diverse than previously thought. Also, predicting what proteins a gene can code for may be much more challenging because of this alternative decoding process.

The technique can also be used to examine the genome of viruses, which are known for hijacking a cell's translation machinery to create new viruses.

"Viruses often use this alternative translation to maximize the coding capacity of their limited genome sequence to generate viral proteins," Qian said. This method has the potential to discover new viral proteins, he added.

Sooncheol Lee, a former postdoctoral researcher, and Botao Liu, a graduate student, both in Qian's lab, are the paper's lead authors, and Ben Shen, a chemist at Scripps Research Institute, was a co-author.

The study was funded by the National Institutes of Health, Ellison Medical Foundation and the U.S. Department of Defense.
