Algorithm speeds metabolic pathway analysis, spurring computational advances in network science
Scientists need a systematic understanding of cellular functioning to metabolically engineer microbes that produce biofuels and other high-value products, and to design drugs that combat pathogens and cancers. Metabolic networks are highly complex, however, so researchers computationally break them down into simplified metabolic pathways in order to analyze function. Think of the complex network of roads between New York City and Seattle. The simplified pathways help determine an optimum route.
But imagine confronting the entire United States map in search of a route, where every road and alternate route is an unmarked candidate pathway. Such computational complexity puts a strain on the current tools for analyzing complex metabolic networks.
A new paper written by Pacific Northwest National Laboratory computational biologist Hyun-Seob Song and three colleagues proposes a novel optimization approach that leads to a remarkable reduction in computation time.
Identifying All Possible Alternative Paths
For calculating metabolic pathways, their proposed alternate integer linear programming (AILP) reduces computation time by orders of magnitude compared to the commonly used mixed integer linear programming (MILP). It enables all possible alternative paths in a given network to be systematically identified, a key measure for evaluating the robustness of metabolic networks.
AILP also provides a new conceptual basis for analyzing other biological network systems, including biogeochemical reaction networks and interspecies interaction networks. Extended, the new algorithm could in the future be used to analyze complex networks of any kind, including electricity and water distribution systems.
Benefits to Bioengineering and Biomedicine
It is important to analyze metabolic pathways for bioengineering and biomedical applications. Such analyses, obtained by creating simpler pathway units, can be used to design new industrially useful microorganisms or to develop drugs that target specific metabolic reactions in pathogens or cancer cells.
However, computing such metabolic pathways from genome-scale metabolic networks all at one time is practically impossible. "The number in genome-scale metabolic networks is truly high - millions to billions," said lead author Song. "Therefore it is very challenging to calculate them simultaneously."
At present, scientists use algorithmic optimization methods to sequentially calculate simplified metabolic pathways through iteration. But the efficiency of mixed integer linear programing (MILP), the most common iterative method, quickly declines the more an iteration goes on.
That's because it has to simultaneously "solve" both the integer and linear programing of a given network - two optimization problems at one time. "As the number builds up," said Song, "the computational burden of MILP increases exponentially."
Splitting Computational Steps
Song and collaborators from the United States, India, and Israel propose a new method of sequential computation. Their alternate integer linear programming (AILP) splits MILP into two separate computational steps, integer programming and linear programming. That makes the problem of computing metabolic pathways from complex networks composed of thousands of reactions doable.
To demonstrate the efficiency of AILP compared to MILP, the researchers used genome-scale networks of Saccharomyces cerevisiae and E. coli.
The original motivation for the algorithm was to more efficiently identify metabolic pathways, which on the genome scale are explosively complicated. But Song envisions a day when variants of the AILP algorithm are used to analyze complex networks outside of biology, including transit systems, water delivery networks, and energy supply systems.
In what Song called a "nice bonus," the new algorithm also calculates - without any extra computational time - "minimal cut sets" within metabolic pathways. Those are the key points along a pathway that, if cut or blocked, stop the operation of the whole network. Identifying cut sets could provide researchers with metabolic target points, where a gene or protein could be deactivated in order to stop the growth of a cancer cell or a pathogen.
Song and his collaborators are now busy extending their method to nonlinear problems in order to apply AILP to complex network systems elsewhere in science and engineering.