Prediction of protein disorder from amino acid sequence


Structural disorder is vital for proteins' function in diverse biological processes. It is therefore highly desirable to be able to predict the degree of order and disorder from amino acid sequence. Researchers from Aarhus University have developed a prediction tool by using machine learning together with experimental NMR data for hundreds of proteins, which is envisaged to be useful for structural studies and understanding the biological role and regulation of proteins with disordered regions.

In the last century, Anfinsen showed beyond a doubt that a protein can find its way back to its 'native' three-dimensional structure after it has been placed under 'denaturing conditions' where the protein is unfolded. The profound conclusion of his experiments was that the information that governs the search back to the native state is hidden in the amino acid sequence. Thermodynamic considerations then set forth a view where the folding process is like rolling energetically downhill to the lowest point: the unique native structure. These findings have often been intertwined with the central dogma of molecular biology. Thus, a gene codes for an amino acid sequence, and the sequence codes for a specific structure.

Enter intrinsically disordered proteins.

The next breakthrough came with the advent of cheap and fast genome sequencing in the wake of the human genome project. Once thousands of genomes of various organisms were sequenced, scientists made a staggering discovery: there were lots and lots of genes that coded for proteins with low-complexity sequences. In other words, these proteins did not contain the right amino acids to fold up, and experiments confirmed that they remained 'intrinsically disordered.' Also, the human genome turned out to have more than a third of its genes coding for protein disorder!
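The idea of a 'low-complexity' sequence can be made concrete with a small sketch. The Python snippet below scores each window of a sequence by its Shannon entropy, a common generic proxy for compositional complexity. It is only an illustration and is not the method used by the researchers; the function names, the window size and both example sequences are hypothetical choices made for this sketch.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def windowed_entropy(sequence: str, window: int = 12) -> List[float]:
    """Entropy of every full-length sliding window along the sequence.

    Returns an empty list if the sequence is shorter than the window.
    The default window size is an arbitrary choice for illustration.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        shannon_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


# Hypothetical sequences, invented for this example only.
low_complexity_example = "SSSSGGSSSSGGSSSS"   # built from just two residue types
mixed_example = "MKTAYIAKQRQISFVK"            # many different residue types

if __name__ == "__main__":
    print(windowed_entropy(low_complexity_example))
    print(windowed_entropy(mixed_example))
```

Windows built from only a few residue types score low, while windows with a varied composition score higher. Low entropy on its own does not prove that a region is disordered; it only flags the kind of repetitive composition the paragraph above describes.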

How to detect protein disorder?

Since disordered proteins are very flexible, they are not amenable to crystallization, and therefore no information can be obtained from X-ray diffraction on protein crystals, the approach that has been so pivotal for folded proteins. Instead, these proteins must be studied in solution, and for this purpose NMR (Nuclear Magnetic Resonance) spectroscopy is the best-suited tool. In this method, a quantum physical property called 'spin' is measured in a strong magnetic field for each atom in the molecule. The exact precession frequencies of the spins are a function of their environment, and it is exactly this frequency that allows researchers to quantitatively measure to what extent each amino acid is ordered or disordered in the protein.

In their new paper in Scientific Reports, published on 8 Sept 2020, Dr. Rupashree Dass, together with Associate Professor Frans Mulder and Assistant Professor Jakob Toudahl Nielsen, used machine learning together with experimental NMR data for hundreds of proteins to build a new bioinformatics tool that they have called ODiNPred. This bioinformatics program can help other researchers make the best possible predictions of which regions of their proteins are rigid and which are likely to be flexible. This information is useful for structural studies, as well as for understanding the biological role and regulation of intrinsically disordered proteins.

More information: ODiNPred: comprehensive prediction of protein order and disorder, Scientific Reports, DOI: 10.1038/s41598-020-71716-1

Journal information: Scientific Reports

Provided by Aarhus University

Citation: Prediction of protein disorder from amino acid sequence (2020, September 9) retrieved 9 December 2022 from