IBM Discovery Could Shed New Light on Workings of the Human Genome

Apr 25, 2006
An example of several tens of "pyknons" that are shared, in various arrangements and combinations, by the sequences of the 3′ untranslated regions (3′ UTRs) of 10 distinct human transcripts. These pyknons come from a set of 127,998 pyknons that are present in the intergenic/intronic regions of the human genome and have additional instances in the untranslated and protein-coding regions of 30,675 transcripts from 20,059 human genes. A 3′ UTR is the part of a gene's transcript that is not translated into an amino acid sequence. Source: Proceedings of the National Academy of Sciences. Credit: IBM

IBM today announced that its researchers have discovered numerous DNA patterns shared between areas of the human genome that were thought to have little or no influence on its function and areas that are known to influence it.

As reported today in the Proceedings of the National Academy of Sciences (PNAS), regions of the human genome that were assumed to contain largely evolutionary leftovers (called “junk DNA”) may actually hold significant clues that can add to scientists’ understanding of cellular processes. IBM researchers have discovered that these regions contain numerous short DNA “motifs,” or repeating sequence fragments, which are also present in the parts of the genome that give rise to proteins.

If verified experimentally, the discovery suggests a potential connection between these coding and non-coding parts of the human genome that could have a profound impact on genomic research and provide important insights on the workings of cells.

“Our goal is to apply advanced computational techniques to analyze the workings of processes and systems, in this case the function of the human genome,” said Ajay Royyuru, head of the Computational Biology Center at IBM Research. “Using these tools, we’ve been able to shed new light on parts of the DNA that were traditionally thought of as not having a specific purpose. We believe the innovative application of technology can provide further understanding in the life sciences at large.”

The IBM team used a mathematical tool called pattern-discovery, often applied to mine useful information from very large repositories of data in both business and scientific applications, to sift through the approximately six billion letters in the non-coding regions of the human genome and look for repeating sequence fragments, or motifs.

Among the millions of discovered motifs, the team identified approximately 128,000 that also occur in the coding region of the genome and are significantly over-represented in genes involved in specific biological processes such as cell communication, regulation of transcription, transport and others. In fact, copies of one or more of these motifs can be found in over 90 percent of all known human gene sequences, as well as some genes of other animals where they associate with similar biological processes.
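The general idea of finding repeated fragments in non-coding DNA and then checking for copies in coding sequences can be illustrated with a minimal sketch. It is not the IBM team's pattern-discovery method: it simply counts fixed-length fragments, and the function names, the fragment length, the copy threshold and the toy sequences are all assumptions made up for illustration.

```python
from collections import Counter


def repeated_fragments(sequence, length, min_copies):
    """Return fixed-length fragments that occur at least min_copies times.

    Simplified stand-in for pattern discovery (an assumption): real motif
    discovery is not limited to a single fixed fragment length.
    """
    counts = Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )
    return {fragment for fragment, n in counts.items() if n >= min_copies}


def shared_with_coding(motifs, coding_sequences):
    """Map each motif to the names of the coding sequences that contain it."""
    hits = {}
    for motif in motifs:
        names = sorted(
            name for name, seq in coding_sequences.items() if motif in seq
        )
        if names:
            hits[motif] = names
    return hits


# Toy data, invented for illustration only (not real genomic sequences).
noncoding = "TTGACCAGTTTGACCAGAATTGACCAGC"
coding = {
    "gene_a": "ATGTTGACCAGTAA",
    "gene_b": "ATGCCCGGGTAA",
}

motifs = repeated_fragments(noncoding, length=8, min_copies=3)
print(shared_with_coding(motifs, coding))  # {'TTGACCAG': ['gene_a']}
```

In this toy example, the fragment TTGACCAG repeats three times in the non-coding string and also appears in one of the two hypothetical genes. The sketch stops at finding shared fragments; testing whether they are over-represented in genes tied to particular biological processes, as the article describes, would be a separate statistical step.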

The report on this work, “Short blocks from the non-coding parts of the human genome have instances within nearly all known genes and relate to biological processes,” by Isidore Rigoutsos, Tien Huynh, Kevin Miranda, Aristotelis Tsirigos, Alice McHardy and Daniel Platt of IBM’s T. J. Watson Research Center, Yorktown Heights, NY, appeared on April 24 in the early edition of the journal PNAS.

Source: IBM
