February 24, 2023

PanGu drug model: Learn a molecule like a human

A recent study published in the journal Science China Life Sciences was led by Dr. Nan Qiao (Laboratory of Health Intelligence, Huawei Cloud Computing Technologies), Dr. Hualiang Jiang (Shanghai Institute of Materia Medica, Chinese Academy of Sciences) and Dr. Mingyue Zheng (Shanghai Institute of Materia Medica, Chinese Academy of Sciences).

"Over the past year, the parameter count of language models has continued to grow, exceeding the 175 billion parameters of GPT-3. Recently, ChatGPT, a new-generation language model, has interacted with users in a more natural way, such as answering questions, admitting mistakes, challenging incorrect premises or rejecting inappropriate requests, and is even thought capable of upending search engines," Dr. Qiao says.

Beyond language models, areas such as image, video and multimodal learning have also been reshaped by transformer architectures in recent years. These large models usually rely on self-supervised learning, which greatly reduces the data-labeling workload and achieves better performance on long-tail tasks. In AI for drug discovery, however, there has been no truly large model to accelerate drug research and development and improve its efficiency.

Xinyuan Lin and Zhaoping Xiong, together with lab director Nan Qiao, sought to build a big model for drug discovery that can be used for drug discovery tasks such as molecular property prediction, molecular generation and optimization. The team proposes a novel graph-to-sequence (graph2seq) asymmetric structure, which is different from the classical sequence-to-sequence (seq2seq) and graph-to-graph (graph2graph) variational auto-encoding processes.

The model is pre-trained on 1.7 billion drug-like molecules (currently the largest such dataset). The input is a two-dimensional undirected cyclic graph of a drug-like molecule, and the output is the corresponding chemical formula or SMILES string. Just as humans read images of chemical structures and write down the text of the corresponding formulas, PanGu learns the relationship between chemical structures and formula strings after billions of repetitions, in a way that resembles human cognitive transformation.
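To make the input and output concrete, the following is a minimal sketch of what one graph-to-sequence training pair could look like, using benzene as the example. The `MolGraph` class, the `tokenize_smiles` helper and the `<bos>`/`<eos>` markers are hypothetical names chosen for illustration, not the data format used by the PanGu team.

```python
from dataclasses import dataclass


@dataclass
class MolGraph:
    """Hypothetical container for a molecule as an undirected graph."""
    atoms: list[str]                    # node labels (element symbols)
    bonds: list[tuple[int, int, int]]   # (atom_i, atom_j, bond_order)

    def adjacency(self) -> dict[int, list[tuple[int, int]]]:
        # Store each bond in both directions, since the graph is undirected.
        adj: dict[int, list[tuple[int, int]]] = {i: [] for i in range(len(self.atoms))}
        for i, j, order in self.bonds:
            adj[i].append((j, order))
            adj[j].append((i, order))
        return adj


def tokenize_smiles(smiles: str) -> list[str]:
    # Naive character-level tokenizer (an assumption); a real tokenizer
    # would treat multi-character atoms such as "Cl" or "Br" as one token.
    return ["<bos>", *smiles, "<eos>"]


# Benzene in Kekule form: a six-membered carbon ring with alternating bonds.
benzene = MolGraph(
    atoms=["C"] * 6,
    bonds=[(0, 1, 2), (1, 2, 1), (2, 3, 2), (3, 4, 1), (4, 5, 2), (5, 0, 1)],
)

# One training pair: the encoder sees the graph, the decoder emits the string.
source_graph = benzene.adjacency()
target_tokens = tokenize_smiles("C1=CC=CC=C1")
```

The asymmetry is visible in the pair itself: the source side is a graph with a ring, while the target side is a flat token sequence in which the ring appears only through the matching `1` closure labels.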

After pre-training on 1.7 billion drug-like small molecules, the model achieved state-of-the-art results in 20 drug discovery tasks, including molecular property prediction (predicting ADMET properties, compound-protein interactions, drug-drug interactions, and chemical reaction yields), molecular generation and molecular optimization.

The PanGu Molecular Generator has also produced a new drug screening library of 100 million drug-like small molecules with a novelty of 99.68%, and it can effectively generate new compounds whose physicochemical properties are similar to a given distribution. This library can be used to supplement existing compound databases. In addition, the PanGu Molecular Optimizer can optimize the chemical structure of a starting molecule and improve the properties of the molecule of interest.
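As a rough illustration of what a novelty figure measures, the sketch below computes the fraction of unique generated molecules that do not appear in a reference set. This is a common definition and is assumed here; the exact definition and the canonicalization step used in the study are not stated in this article. The `novelty` function and the toy SMILES lists are hypothetical.

```python
def novelty(generated: list[str], reference: list[str]) -> float:
    """Fraction of unique generated molecules absent from the reference set.

    Assumes every string is already a canonical SMILES, so that the same
    molecule is always written the same way.
    """
    unique_generated = set(generated)
    if not unique_generated:
        return 0.0
    unseen = unique_generated - set(reference)
    return len(unseen) / len(unique_generated)


# Toy data: three unique generated molecules, one of which is already known.
generated_library = ["CCO", "CCN", "CCC", "CCO"]
known_molecules = ["CCC"]
score = novelty(generated_library, known_molecules)
```

With the toy data, two of the three unique generated strings are unseen, so the score is two thirds.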

More information: Xinyuan Lin et al, PanGu Drug Model: learn a molecule like a human, Science China Life Sciences (2022). DOI: 10.1007/s11427-022-2239-y

Provided by Science China Press

Citation: PanGu drug model: Learn a molecule like a human (2023, February 24) retrieved 6 August 2024 from https://phys.org/news/2023-02-pangu-drug-molecule-human.html

