When one reference genome is not enough

December 20, 2017, DOE/Joint Genome Institute
A single reference genome is not enough to harness the full genetic variation of a species, so pan-genomes of crops would be extremely useful. The phenotypic diversity of Brachypodium plants is demonstrated in this image, which accompanies a news release for a Nature Communications paper in which an international team led by DOE Joint Genome Institute researchers gauged the size of a plant pan-genome using the model grass Brachypodium distachyon. Credit: John Vogel

Much of the research in the field of plant functional genomics to date has relied on approaches based on single reference genomes. But by itself, a single reference genome does not capture the full genetic variability of a species. A pan-genome, the non-redundant union of all the sets of genes found in individuals of a species, is a valuable resource for unlocking natural diversity. However, the computational resources required to produce a large number of high-quality genome assemblies have been a limiting factor in creating plant pan-genomes.
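
To make the definition concrete, here is a minimal Python sketch that treats each line's gene content as a set and takes the union. The line names and gene identifiers are invented for illustration and are not data from the study; a real pan-genome analysis would first need assembled, annotated genomes and a way to decide which genes in different lines count as the same gene, which this toy example assumes has already been done.

```python
# Toy illustration of a pan-genome as the non-redundant union of gene sets.
# All line names and gene IDs below are hypothetical, not data from the study.

lines = {
    "line_A": {"g1", "g2", "g3"},
    "line_B": {"g1", "g2", "g4"},
    "line_C": {"g1", "g3", "g5"},
}


def pan_genome(gene_sets):
    """Return every gene seen in at least one line (the union)."""
    return set().union(*gene_sets.values())


def core_genes(gene_sets):
    """Return the genes present in every line (the intersection)."""
    return set.intersection(*gene_sets.values())


pan = pan_genome(lines)
core = core_genes(lines)
variable = pan - core  # genes missing from at least one line

print("pan-genome:", sorted(pan))
print("found in every line:", sorted(core))
print("found in only some lines:", sorted(variable))
```

In this made-up case each line carries three genes, but the union holds five, so any single line chosen as the reference would miss part of the species' gene content.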

Having plant pan-genomes for crops that are important for fuel and food applications would enable breeders to harness natural diversity to improve traits such as yield, disease resistance, and tolerance of marginal growing conditions. In a paper published December 19, 2017 in Nature Communications, an international team led by researchers at the U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a DOE Office of Science User Facility at Lawrence Berkeley National Laboratory (Berkeley Lab), gauged the size of a plant pan-genome using Brachypodium distachyon, a wild grass widely used as a model for grain and biomass crops. As one of the JGI's Plant Flagship Genomes, B. distachyon ranks among the most complete plant reference genomes.

"There are a vast number of genes that are not captured in a single reference genome," said study senior author John Vogel, head of the JGI's Plant Functional Genomics group. "Indeed, about half of the genes in the pan-genome are found in a variable number of lines." Working toward the primary goal of accurately estimating the size of a plant pan-genome, Vogel and his colleagues performed whole-genome de novo assembly and annotation of 54 geographically diverse lines of B. distachyon, yielding a pan-genome containing nearly twice the number of genes found in any individual line.

"The genome of a species is a collection of genomes, each with their own unique twist," added JGI bioinformaticist and study first author Sean Gordon. "Now knowing that focusing on a single reference genome leads to incomplete and biased estimates of genetic diversity and ignores genes potentially important for breeding applications, we should better incorporate multiple references in future studies of ."

Moreover, genes found in only some lines tend to contribute to biological processes (e.g., disease resistance, development) that may be beneficial under some environmental conditions, whereas genes found in every line usually underpin essential cellular processes (e.g., glycolysis, iron transport).

"This means that the variable genes are being preferentially retained if they are beneficial under some conditions. These are exactly the types of genes that breeders need to improve crops," Vogel said.

In addition, genes found in only a subset of lines displayed faster rates of evolution, lay closer to transposable elements (thought to play a key role in pan-genome evolution), and were less likely to be found in the same chromosomal location as functionally equivalent genes in other grasses.

The sequence assemblies, gene annotations and related information can be downloaded from the project website BrachyPan: brachypan.jgi.doe.gov. The Brachypodium distachyon genome is available on the JGI Plant Portal Phytozome: phytozome.jgi.doe.gov.

More information: Sean P. Gordon et al, Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure, Nature Communications (2017). DOI: 10.1038/s41467-017-02292-8