Just as the name implies, complex systems are difficult to tease apart. An organism's genome, a biochemical reaction, or even a social network all contain many interdependent components—and changing any one of them can have pervasive effects on all the others. In the case of a very large system, like the human genome, which contains 20,000 interconnected genes, it's impossible to monitor the whole system at once.
But that may not matter anymore. In a paper published in the prestigious multidisciplinary journal Proceedings of the National Academy of Science, Northeastern network scientific have developed an algorithm capable of identifying the subset of components—or nodes—that are necessary to reveal a complex system's overall nature.
The approach takes advantage of the interdependent nature of complexity to devise a method for observing systems that are otherwise beyond quantitative scrutiny.
"Connectedness is the essence of complex systems," said Albert-László Barabási, one of the paper's authors and a Distinguished Professor of Physics with joint appointments in biology and the College of Computer and Information Science. "Thanks to the links between components, information is distributed throughout a network. Hence I do not need to monitor everyone to have a full sense of what the system does."
Barabási's collaborators comprise Jean-Jacques Slotine of M.I.T. and Yang-Yu Liu, lead author and research associate professor in Northeastern's Center for Complex Network Research, for which Barabási is the founding director.
Using their novel approach, the researchers first identify all the mathematical equations that describe the system's dynamics. For example, in a biochemical reaction system, several smaller reactions between peripherally related molecules may collectively account for the final product. By looking at how the variables are affected by each of the reactions, the researchers can then draw a graphical map of the system. The nodes that form the foundation of the map reveal themselves as indispensible to understanding any other part of the whole.
"What surprised me," said Liu, "was that the necessary nodes are also sufficient in most cases." That is, the indispensible nodes can tell the whole story without drawing on any of the other components.
The metabolic system of any organism is a collection of hundreds of molecules involved in thousands of biochemical reactions. The new method, which combines expertise from control theory, graph theory, and network science, reduces large complex systems like this to a set of essential "sensor nodes."
In the case of metabolism, the researchers' algorithm could simplify the process of identifying biomarkers, which are molecules in the blood that tell clinicians whether an individual is healthy or sick. "Most of the current biomarkers were selected almost by chance," said Barabási. "Chemists and doctors found that they happen to work. Observability offers a rational way to choose biomarkers, if we know the system we need to monitor."
Explore further: New algorithm can separate unstructured text into topics with high accuracy and reproducibility
More information: www.pnas.org/content/110/7/2460