(Phys.org) – Genomics researchers with the University of Arizona's iPlant Collaborative, housed in the BIO5 Institute, have helped unravel the genetic code of the rapeseed plant, best known for a variety whose seeds are made into canola oil.
The findings will help breeders select for desirable traits such as richer oil content and faster seed production. Other potential applications include modifying the quality of canola oil, making it more nutritious and adapting the plants to grow in more arid regions.
In addition, the findings help scientists better understand how plant genomes evolve in the context of domestication. Brassica plants have been bred all over the world for centuries, resulting in produce and products diverse enough for supermarkets to place them across several different aisles.
Broccoli, cauliflower, Brussels sprouts, Chinese cabbage, turnip, collard greens, mustard, canola oil – all these are different incarnations of the same plant genus, Brassica.
"Whole-genome sequencing efforts like this one allow us to address two fundamental questions," said Eric Lyons, an assistant professor in the School of Plant Sciences at the University of Arizona College of Agriculture and Life Sciences, whose research team provides the software architecture for this and many other genome research projects. "How does the genetic information stored in the genome help us understand the functions of the organism, and what does the structure of the genome tell us about the evolution of genomes in general?"
The endeavor, which was led by institutions in France, Canada, China and the U.S., revealed that the rapeseed (Brassica napus) genome contains a large number of genes – more than 100,000 – because it arose from a merger between two parent species, Brassica rapa (Chinese cabbage) and Brassica oleracea, a species whose cultivars include broccoli, cauliflower, Brussels sprouts, collard greens and others.
The findings appear in the Aug. 22 issue of the journal Science and come on the heels of another international sequencing effort led by UA researchers, which revealed the complete genome of African rice.
The computational power and cyberinfrastructure for running the analyses are provided by the iPlant Collaborative, a $100 million project funded by the National Science Foundation and headquartered in the UA's BIO5 Institute.
"The rapeseed genome has a very interesting history," said Haibao Tang, one of the leading authors of the study, who just joined the UA as a senior scientist for bioinformatics. "As a result of the merger event, it ended up with four copies of each gene. In this study, we looked at what happened after this merging event. For example, what genes were gained and what genes were lost."
"The Brassica group is extremely versatile with regard to human use," he said. "In all of the cultivars, we find something to eat. The genome defines what Brassicas are."
"It also defines what kids hate to eat," Lyons added. "The bitterness in some cultivars such as broccoli or Brussels sprouts comes from a class of compounds called glucosinolates, and we find that precisely those genes that code for those compounds were lost from the rapeseed genome."
The sequencing effort provides scientists and breeders with a map they can use to home in on certain genes and, by extension, the plant's metabolic pathways. For example, they could strive to create a cultivar of broccoli that's not bitter, or tweak the lipid biosynthesis pathway to favorably modify the oil content in rapeseed. Being able to modify the content of bitter-tasting compounds has implications beyond what meets the tongue, because in most plants, those chemicals also confer defense against pests.
"Depending on the cultivar in question, breeders may want to change the biochemistry," Lyons said. "You could knock down chemicals you don't want and ramp up others you do want. Or you may want to change the shape of the plant or parts of it. With Chinese cabbage, for example, we don't care too much about its oil content, but the size and shape of the leaves and how they taste. With rapeseed, it's the other way around."
The successful completion of the rapeseed genome sequence stems from a long-standing collaboration between Lyons and Tang, who comes to the UA from the J. Craig Venter Institute in Rockville, Maryland. Tang has specialized in writing the core algorithms of the Comparative Genomics platform, CoGe, which is powered by iPlant's cyberinfrastructure and provides a management layer for genomic data and tools to process them.
"Plugging into the infrastructure of iPlant allows us to scale far beyond what we could do otherwise," Lyons said. "Each of these analyses takes hundreds of computing hours. In other words, either one computer working for hundreds of hours, or hundreds of computers working for one hour. With iPlant, we have access to a thousand or more computers to do this."
"We developed the tools people need to analyze large genomes, and now we can focus on discoveries," Tang said.
Lyons added: "Because we have been working to make these tools scalable, they are being used for virtually every genome analysis."
"Currently, CoGe and iPlant are being utilized to analyze 23,000 genomes from 17,000 organisms," Lyons said. "What started out as a plant genome platform has long expanded into all other areas of biology."
"We are currently involved with the genomes of birds, insects, bees, cows, fish, pigs, horses and many plants," Lyons said. "The tool that we have developed for the past few years has become a critical part of the ecosystem of bioinformatics tools that people regularly use."
"Leveraging iPlant, we can empower scientists around the world to compare genomes among each other, and allow people to pick what they want to do, when they want it, and the way they want it," he added.
More information: Boulos Chalhoub et al., "Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome," Science, 22 August 2014, Vol. 345, no. 6199, pp. 950-953. DOI: 10.1126/science.1253435