April 19, 2019

A universal framework combining genome annotation and undergraduate education

by Serena Stern, Boyce Thompson Institute

As genome sequencing becomes cheaper and faster, resulting in an exponential increase in data, the need for efficiency in predicting gene function is growing, as is the need to train the next generation of scientists in bioinformatics. Researchers in the lab of Lukas Mueller, a faculty member of the Boyce Thompson Institute (BTI), have developed a strategy to fulfill both of these needs, benefiting students and researchers in the process.

The Mueller Lab created a framework using the tremendous influx of new genome sequences as a training resource for undergraduates interested in learning genome annotation. This framework was published online in PLOS Computational Biology on April 3, 2019.

What is genome annotation, and why is it important?

After researchers determine the sequence of the millions of base pairs of DNA in an organism's genome, they need to figure out two things: which DNA segments are genes that encode proteins, and what those proteins' functions are. This process of identifying genes and predicting their functions is called genome annotation.

"The prediction of genes and their functions is what most biologists are interested in. That's where most understanding of biological processes is happening," says Prashant Hosmani, a Bioinformatics Analyst in the Mueller Lab and first author on the paper.

A genome is annotated by comparing its sequence to gene sequences from other related organisms. The most accurate method of genome annotation is manual curation, where a person does the analysis. By contrast, using a computer program to identify genes and their functions is faster but sometimes less accurate.
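
As a rough illustration of the comparison step, the Python sketch below transfers a function label to each new gene from its most similar reference gene, then sets aside genes in a pathway of interest for a person to review. It is a toy under stated assumptions, not the Mueller Lab's workflow: the sequences, gene names, function labels and the `percent_identity` measure are all invented for this example, and real projects rely on dedicated alignment tools rather than position-by-position matching.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    """A known gene from a related organism (hypothetical toy record)."""
    name: str
    function: str
    pathway: str
    sequence: str


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions over the shorter sequence.

    A crude stand-in for a real sequence alignment; it assumes the two
    sequences are already lined up position by position.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / n


def auto_annotate(query: str, references: list[Reference]) -> tuple[Reference, float]:
    """Automatic pass: pick the most similar reference gene."""
    best = max(references, key=lambda r: percent_identity(query, r.sequence))
    return best, percent_identity(query, best.sequence)


def triage(queries: dict[str, str], references: list[Reference],
           pathways_of_interest: set[str]):
    """Annotate every gene automatically, then queue genes in the
    pathways of interest for manual curation."""
    automatic = {}
    manual_queue = []
    for gene_id, seq in queries.items():
        best, identity = auto_annotate(seq, references)
        automatic[gene_id] = (best.function, identity)
        if best.pathway in pathways_of_interest:
            manual_queue.append(gene_id)
    return automatic, manual_queue


# Invented toy data, for illustration only.
REFERENCES = [
    Reference("refA", "chlorophyll synthase", "photosynthesis", "ATGGCCTTAA"),
    Reference("refB", "sucrose transporter", "sugar transport", "ATGTTTGGGC"),
]
QUERIES = {"gene1": "ATGGCCTTAT", "gene2": "ATGTTTGGGC"}

if __name__ == "__main__":
    automatic, manual_queue = triage(QUERIES, REFERENCES, {"photosynthesis"})
    print(automatic)
    print(manual_queue)
```

Every gene receives a fast automatic label, while human effort is spent only on the subset a project cares about most.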

"Manual annotation is very time-intensive and thus expensive," said Surya Saha, Senior Bioinformatics Analyst in the Mueller Lab and the project coordinator. "The trick is to do both: first use automatic annotation, and then focus on genes and biochemical pathways of interest and annotate them manually."

The paper outlines a set of logical steps to begin an undergraduate annotation program from the ground up. When students first join the project, they are trained by team leaders and expert annotators on the tools of the trade.

Throughout the project, students keep careful notes of their research and results, ultimately compiling them into a report on the biochemical pathway of interest and its member gene families. That report may be published. Indeed, this method has been used to generate a peer-reviewed publication with more than 20 undergraduate authors.

"Working is one thing, and receiving an acknowledgement for that work is also really important," says Hosmani. "That acts as a real motivation for the students."

Other student benefits include working with international collaborators, networking, practicing communication and peer review skills, and gaining valuable insights into career options. Undergraduates may also receive research or capstone project credits for their work, which increases their commitment to the project. More and more science-based graduate programs also require knowledge of bioinformatics, so these skills will prove valuable in many fields.

In the end, the researchers gain high-quality genome annotations for any species, not just plants. These annotations offer a better understanding of how the organism functions, ultimately benefiting society in many fields, such as agriculture, biofuels and medicine.

The authors hope that other institutions will adapt and build upon this framework, no matter their size, access to resources or annotation goals. To make the framework easy to use, the authors designed their figures and tables to be stand-alone and printer-friendly for easy reference.

"Anybody who has a research problem, a sequenced genome, and interested students can implement a system by building off our workflow," said Saha.

More information: Prashant S. Hosmani et al., A quick guide for student-driven community genome annotation, PLOS Computational Biology (2019). DOI: 10.1371/journal.pcbi.1006682

Journal information: PLOS Computational Biology

Provided by Boyce Thompson Institute

Citation: A universal framework combining genome annotation and undergraduate education (2019, April 19) retrieved 2 July 2024 from https://phys.org/news/2019-04-universal-framework-combining-genome-annotation.html

