Solving puzzles without a picture: New algorithm assembles chromosomes from next generation sequencing data

January 10, 2013

One of the most difficult problems in the field of genomics is assembling relatively short "reads" of DNA into complete chromosomes. In a new paper published in Proceedings of the National Academy of Sciences, an interdisciplinary group of genome and computer scientists has solved this problem, creating an algorithm that can rapidly create "virtual chromosomes" with no prior information about how the genome is organized.

The powerful DNA sequencing methods developed about 15 years ago, known as next generation sequencing (NGS) technologies, create thousands of short fragments. In species whose genetics has already been extensively studied, existing information can be used to organize and order the NGS fragments, rather like using a sketch of the complete picture as a guide to a jigsaw puzzle. But as scientists push into less-studied species, it becomes more difficult to finish the puzzle.

To solve this problem, a team led by Harris Lewin, distinguished professor of evolution and ecology and vice chancellor for research at the University of California, Davis, and Jian Ma, assistant professor at the University of Illinois at Urbana-Champaign, created an algorithm that uses the known chromosome organization of one or more known species and NGS information from a newly sequenced genome to create virtual chromosomes.

"We show for the first time that chromosomes can be assembled from NGS data without the aid of a preexisting genetic or physical map of the genome," Lewin said.

The new algorithm will be very useful for large-scale sequencing projects such as G10K, an effort to sequence 10,000 vertebrate genomes, of which very few have a map, Lewin said.

"As we have shown previously, there is much to learn about phenotypic evolution from understanding how chromosomes are organized in one species relative to other species," he said.

The algorithm, called RACA (for reference-assisted chromosome assembly), was co-developed by Jaebum Kim, now at Konkuk University, South Korea, and Denis Larkin of Aberystwyth University, Wales. Kim wrote the software tool, which was evaluated using simulated data, standardized reference genome datasets, and a primary NGS assembly of the newly sequenced Tibetan antelope genome generated by BGI (Shenzhen, China) in collaboration with Professor Ri-Li Ge at Qinghai University, China. Larkin led the experimental validation, in collaboration with scientists at BGI, proving that the predictions of chromosome organization were highly accurate.

Ma said that the new RACA algorithm will perform even better as developing NGS technologies produce longer reads of DNA sequence.

"Even with what is expected from the newest generation of sequencers, complete chromosome assemblies will always be a difficult technical issue, especially for complex genomes. RACA predictions address this problem and can be incorporated into current NGS assembly pipelines," Ma said.
