Machine learning may boost protein production for better pharmaceuticals

December 22, 2017, Teesside University

A machine learning program developed by an international team of researchers may help pharmaceutical companies produce higher quantities of cutting-edge drugs needed for medical treatments.

In a study, the team developed a computer algorithm using a model of Chinese hamster ovary cells, a cell line often used by biopharmaceutical researchers for medical research, to optimize the production of proteins in those cells.

"The pharmaceutical industry typically relies on ovary cells of a Chinese hamster—CHO cells—for research to create effective drugs, but, because the cells do not produce much protein per cell, it requires large-scale production," said Claudio Angione, senior lecturer in computer science, Teesside University. "What we show is that, compared to other methods, combining this metabolic modelling with data-driven methods could be a vast improvement to the automation of cultures design, by accurately identifying optimal growth conditions for producing target therapeutic compounds."

The researchers, who reported their findings at the Second International Electronic Conference on Metabolomics, combined machine learning and a computational model that reconstructs the metabolism of Chinese hamster ovary (CHO) cells to maximize the cells' efficiency.

"This is a novel step because, for the first time, we are combining two methodologies usually used individually in bioprocessing studies," said Angione.

The researchers were able to predict the production of lactate—a toxic waste product—inside the cells, in terms of both their genetic and metabolic states.

"Production of lactate is generally undesired as it hinders cell growth and consequently limits the yield of desired products," said Macauley Coggins, research assistant, Teesside University. "By predicting the cellular conditions where lactate accumulation is minimized it is possible to reduce—or possibly avoid—long series of experimental trials."

Therapeutic proteins, like the ones produced in CHO cells, have a wide range of applications in medicine.

"Some of them are used in vaccines and protect against infectious agents such as viruses," added Guido Zampieri, a doctoral student in genomics and bioinformatics, CRIBI Biotechnology Center, University of Padua. "Other proteins with special targeting activity can be used to treat patients that lack those proteins due to genetic conditions. Anticancer drugs are another example."

Machine learning is a field that explores how computers can learn to solve problems and undertake specific tasks without being explicitly programmed, according to Coggins. To do this, researchers usually develop an algorithm that trains a computer to recognize patterns from labelled examples, a machine learning technique often referred to as supervised learning.
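
As a rough illustration of supervised learning in general, and not the researchers' own algorithm, the Python sketch below learns from a handful of labelled examples and then labels an unseen one. The features (number of corners, width-to-height ratio), the toy data and the nearest-centroid rule are all assumptions chosen for brevity.

```python
from collections import defaultdict
from math import dist


def train_centroids(examples):
    """Average the feature vectors of each label (the 'learning' step)."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labelled examples: (number of corners, width / height) -> shape
training_data = [
    ((3, 1.0), "triangle"),
    ((3, 1.2), "triangle"),
    ((4, 1.0), "square"),
    ((4, 1.1), "square"),
]

centroids = train_centroids(training_data)
print(predict(centroids, (3, 1.1)))  # classify an unseen three-cornered shape
```

The program is never told a rule such as "three corners means triangle"; it only sees examples with their labels and generalizes from them.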

"It's a lot like how you teach a child to recognize different shapes by showing them what each shape is and what it looks like," Coggins said.

In the future, this method could be used to optimize other metabolites or proteins, the researchers suggest. Producing higher quantities of drugs could also lead to less expensive treatments.

"We see several interesting research directions," said Angione. "Primarily, we aim at pushing forward the integration of different computational methodologies such as machine learning and biological modelling. This is important as they possess different strong points, which if combined could allow adopting more precise bioengineering interventions.

"Particularly, machine learning can extract useful knowledge from experimental data, while metabolic modelling provides insights about local and global mechanisms in biochemical networks.

"We also want to explore other bioengineering steps that could benefit from this integrated optimization. The final goal is to obtain a set of computational tools that can guide industrial processes across multiple levels."

The researchers used a publicly available, large-scale gene expression dataset covering two different CHO cell lines: 295 microarray profiles, with expression values for 3592 genes, from 121 CHO cell cultures. For the reconstruction, they used a recently developed genome-scale metabolic model (GSMM), which is used to accurately predict growth phenotypes. The model is currently the largest reconstruction of CHO metabolism.

They then combined the model of CHO cell metabolism with the gene expression data to create condition-specific and cell line-specific polyomics models.
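
The article does not say how the expression data were mapped onto the metabolic model. One simple possibility, sketched below in Python purely as an assumption, is to scale each reaction's upper flux bound by the relative expression of its associated gene, so that each culture condition gets its own version of the model. The function name, reaction names, gene names and numbers are hypothetical.

```python
def condition_specific_bounds(base_bounds, reaction_genes, expression):
    """Scale each reaction's upper flux bound by its gene's relative expression.

    Assumptions: at most one gene per reaction, and proportional scaling
    against the mean expression of the profiled genes. Real genome-scale
    models use richer gene-to-reaction rules.
    """
    mean_expression = sum(expression.values()) / len(expression)
    bounds = {}
    for reaction, upper in base_bounds.items():
        gene = reaction_genes.get(reaction)
        if gene is None or gene not in expression:
            bounds[reaction] = upper  # no data: keep the generic bound
        else:
            bounds[reaction] = upper * expression[gene] / mean_expression
    return bounds


# Hypothetical toy values, not taken from the study
base_bounds = {"glucose_uptake": 10.0, "lactate_secretion": 10.0, "biomass": 10.0}
reaction_genes = {"glucose_uptake": "gene_a", "lactate_secretion": "gene_b"}
expression = {"gene_a": 6.0, "gene_b": 2.0}

bounds = condition_specific_bounds(base_bounds, reaction_genes, expression)
print(bounds)
```

In a full workflow, bounds like these would constrain the genome-scale model before it is solved, and the resulting condition-specific outputs could then serve as inputs to a learning step.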
