Combining machine learning with Bayesian inference and metabolic modeling helps design new yeast capable of producing biofuels. Credit: Nathan Johnson | Pacific Northwest National Laboratory

The science is clear: fossil fuels are harmful to the environment. So why is it so difficult for us to stop using them? Economic reasons are at least part of the answer. From our energy grid to the manufacturing of certain textiles and other products, many parts of our society are built to use fossil fuels. Transitioning away will come at some cost.

But what if we could produce an economically attractive replacement for fossil fuels? New research from Pacific Northwest National Laboratory (PNNL) suggests a way to do just that. Biologists have devised a way to engineer yeast to produce itaconic acid, a valuable commodity chemical, using data integration and supercomputing power as a guide.

Creating microbial factories using metabolic modeling

Itaconic acid has potential as a renewable chemical building block. It could substitute for some fossil-fuel-derived products. In 2004, it was named one of the "top value added chemicals from biomass" in a report by the Department of Energy (DOE). Seeing the potential of itaconic acid as a petrochemical replacement, data scientist Neeraj Kumar set out to inexpensively produce it using microbes.

Kumar and colleagues had previously developed a way to calculate how engineered changes in microbes could affect their metabolism. Building upon this idea, Kumar wanted to see if he could use these metabolic predictions to engineer yeast to produce high amounts of itaconic acid.

"We needed to identify what genes in the itaconic acid production pathway we could alter so the yeast could make greater quantities of the chemical," said Kumar. "The challenge was finding the balance between cellular health and bioproduction."

Biologist Erin Bredeweg shows off different cultures of the Yarrowia lipolytica yeast. Credit: Andrea Starr | Pacific Northwest National Laboratory

Design-build-test-learn

Itaconic acid is naturally produced by just a few fungi. PNNL scientist Ziyu Dai borrowed genes from other fungi to give Yarrowia lipolytica the ability to produce it. Biologist Erin Bredeweg had been working on this modified yeast, containing several different gene combinations, when Kumar approached her to collaborate. Bredeweg and her colleagues had created a metabolic and proteomic profile of the modified yeast and passed the data to Kumar.

Taking cues from the Design-Build-Test-Learn strategy, Kumar and his research associate Andrew McNaughton used computational analysis to examine this profile and see which nonessential genes could be removed from the yeast, or which helpful ones could be added, to increase the production of itaconic acid.
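To make the idea of screening gene removals concrete, here is a minimal toy sketch in Python. It is not the researchers' model: the gene names, capacity values, and the rule that carbon splits among competing pathways in proportion to capacity are all assumptions made purely for illustration. It deletes each nonessential gene in turn, discards designs whose growth falls below a chosen threshold, and ranks the rest by carbon flow to itaconic acid.

```python
# Toy branch-point model. Every gene name and number here is hypothetical.
UPTAKE = 10.0  # carbon entering the branch point (arbitrary units)

# gene -> (pathway it feeds, relative capacity, is the gene essential?)
GENES = {
    "growth_gene": ("biomass", 5.0, True),
    "ita_gene": ("itaconic_acid", 1.0, False),
    "byproduct_gene_1": ("byproduct", 3.0, False),
    "byproduct_gene_2": ("byproduct", 1.0, False),
}


def carbon_split(deleted=frozenset()):
    """Split UPTAKE among pathways in proportion to remaining capacity."""
    capacity = {}
    for gene, (pathway, cap, _essential) in GENES.items():
        if gene not in deleted:
            capacity[pathway] = capacity.get(pathway, 0.0) + cap
    total = sum(capacity.values())
    if total == 0:
        return {}  # nothing left to carry carbon
    return {pathway: UPTAKE * cap / total for pathway, cap in capacity.items()}


def screen_single_deletions(min_growth=4.0):
    """Rank single deletions of nonessential genes by itaconic acid flux.

    Designs whose biomass flux drops below min_growth are discarded,
    a crude stand-in for keeping the cell healthy.
    """
    results = []
    for gene, (_pathway, _cap, essential) in GENES.items():
        if essential:
            continue
        flux = carbon_split({gene})
        if flux.get("biomass", 0.0) >= min_growth:
            results.append((gene, flux.get("itaconic_acid", 0.0)))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for gene, ita_flux in screen_single_deletions():
        print(f"delete {gene}: itaconic acid flux = {ita_flux:.2f}")
```

In this toy setup, removing the larger competing byproduct route frees the most carbon for itaconic acid. A real metabolic model tracks thousands of reactions and would be constrained by measured omics data rather than made-up capacities.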

Once they selected the genes to "design" the organism, it was time to build. Bredeweg created different versions of the yeast with genes added or removed based on Kumar and McNaughton's computational predictions. She then tested the different yeasts to see if carbon flow toward itaconic acid production pathways was affected. Machine learning analysis of the data from RNA sequencing indicated that the computational predictions matched the experimental outcome and further detailed gene predictions for future analysis.

"Though this research is still in the early stages, it is exciting to see its potential," said Bredeweg. "Machine learning and causal inference can uncover new ways of thinking about how a complex cell system, like yeast, could respond to individual gene changes, beyond what is possible from metabolic modeling alone."

Machine learning and multiomics datasets expand the potential of metabolic modeling

Yeasts and other microbes are commonly used to produce useful chemicals. While it is easy to get them to produce some chemicals, like ethanol, in high yields, other chemicals pose more of a challenge. Kumar hopes that this system of combining machine learning with metabolic modeling and multiomics datasets will help overcome these production challenges.

"Though we still need more testing on this model, there is an amazing potential to expand this computationally guided bioengineering to other systems," said Kumar. "This strategy could open up a new era in biosystem design for the production of eco-friendly chemicals."

More information: Andrew D. McNaughton et al., Bayesian Inference for Integrating Yarrowia lipolytica Multiomics Datasets with Metabolic Modeling, ACS Synthetic Biology (2021). DOI: 10.1021/acssynbio.1c00267

Journal information: ACS Synthetic Biology