New software developed by researchers at the University of Illinois at Urbana-Champaign allows scientists to more effectively analyze and compare both sequence and structure data from a growing library of proteins and nucleic acids.
"MultiSeq (pronounced Multi-seek) allows you to bring in both structure and sequences without structure, and use the complementary information contained within them to investigate changes in the system," said Zaida Luthey-Schulten, a professor of chemistry and a researcher at the Beckman Institute for Advanced Science and Technology at the U. of I. "By placing bioinformatics in the context of evolution, we can also perform comparative dynamics studies of proteins from different domains of life."
Currently, more than 3 million sequences and 35,000 structures of proteins and nucleic acids are available for study. By providing an environment for the evolutionary analysis of these data, the software can help scientists gain valuable insight into basic scientific questions, such as the origin of life, as well as questions of a more practical nature, such as the development of resistance to ribosome-targeting antibiotics.
Developed by Luthey-Schulten and graduate students Elijah Roberts, John Eargle and Dan Wright, MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD (Visual Molecular Dynamics), a program for visualizing and analyzing molecular dynamics simulations. Developed at the U. of I. and distributed free of charge, VMD is designed to efficiently handle large three-dimensional systems containing more than a million atoms. MultiSeq extends VMD's capabilities by incorporating the more diverse evolutionary data available in sequences into the analysis process.
For example, the computational tools in MultiSeq may help scientists understand the evolution of ribosomes, the basic machinery of translation. Translation is a key process in all life, and the components of this cellular machinery are the biomolecules with the most linear line of descent.
"If we want to try and understand how translation has changed among the three domains of life, we have to at least be able to overlap and compare three ribosomes," Luthey-Schulten said. "Last year, we could not compare two ribosomes. Now, using MultiSeq, we can compare more than 20 ribosomes."
MultiSeq combines sequence and structure data within an evolutionary framework. It draws on information science to organize and search the data, information visualization to assist in recognizing correlations, mathematics to formulate statistical inferences, and biology to analyze chemical and physical properties in terms of sequence and structure changes.
The researchers developed MultiSeq in collaboration with the Theoretical and Computational Biophysics group at the Beckman Institute, and with the NIH (National Institutes of Health) Resource for Macromolecular Modeling and Bioinformatics. They describe the software in a paper accepted for publication in the journal BMC Bioinformatics, and featured on the journal's Web site. The software is being used in classrooms this fall as a teaching tool for computational chemical biology.
"We believe the complexity present in biology can not be fully understood without using evolution as an underlying framework," the researchers write. "This approach can speed up research by revealing unproductive tasks in advance or by exposing new paths through the introduction of distant but related data."
For details on how to download and use the software, visit the MultiSeq website at: www.scs.uiuc.edu/~schulten/multiseq/
Source: University of Illinois at Urbana-Champaign