May 2, 2022

Using 'counterfactuals' to verify predictions of drug safety

Overview of MMACE. The input is a molecule to be predicted. Chemical space is expanded and clustered. Counterfactuals are selected from the clusters to find a succinct explanation of the base molecule's prediction. Credit: *Chemical Science* (2022). DOI: 10.1039/D1SC05259D

Scientists rely increasingly on models trained with machine learning to provide solutions to complex problems. But how do we know the solutions are trustworthy when the complex algorithms the models use cannot easily be interrogated and cannot explain their decisions to humans?

That trust is especially crucial in drug discovery, for example, where machine learning is used to sort through millions of potentially toxic compounds to determine which might be safe candidates for pharmaceutical drugs.

"There have been some high-profile accidents in computer science where a model could predict things quite well, but the predictions weren't based on anything meaningful," says Andrew White, associate professor of chemical engineering at the University of Rochester, in an interview with Chemistry World.

White and his lab have developed a new "counterfactual" method, described in Chemical Science, that can be used with any molecular structure-based machine learning model to better understand how the model arrived at a conclusion.

Counterfactuals can tell researchers "the smallest change to the features that would alter the prediction," says lead author Geemi Wellawatte, a Ph.D. student in White's lab. "In other words, a counterfactual is an example as close to the original, but with a different outcome."

Counterfactuals can help researchers quickly pinpoint why a model made a prediction, and whether it is valid.
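The idea of "the closest example with a different outcome" can be sketched in a few lines of Python. This is only a toy illustration, not the MMACE implementation: the `Candidate` class, the feature sets, the `predict` function, and the use of Tanimoto (Jaccard) similarity on those sets are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    name: str            # hypothetical label for a molecule
    features: frozenset  # toy stand-in for a structural fingerprint


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def closest_counterfactual(
    base: Candidate,
    candidates: Iterable[Candidate],
    predict: Callable[[Candidate], bool],
) -> Optional[Candidate]:
    """Return the candidate most similar to `base` whose prediction differs."""
    base_label = predict(base)
    flipped = [c for c in candidates if predict(c) != base_label]
    if not flipped:
        return None  # no candidate changes the prediction
    return max(flipped, key=lambda c: tanimoto(base.features, c.features))


# Toy "model" (an assumption): predicts soluble when a hydroxyl group is present.
def predict_soluble(c: Candidate) -> bool:
    return "OH" in c.features


base = Candidate("base", frozenset({"ring", "CH3", "Cl"}))
pool = [
    Candidate("add_hydroxyl", frozenset({"ring", "CH3", "Cl", "OH"})),
    Candidate("stripped_hydroxyl", frozenset({"ring", "OH"})),
    Candidate("swap_halogen", frozenset({"ring", "CH3", "Br"})),
]
counterfactual = closest_counterfactual(base, pool, predict_soluble)
```

Here the counterfactual is the candidate that differs from the base by a single feature, which suggests the toy model's prediction hinges on that feature alone. A researcher can then judge whether that dependence is chemically reasonable.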

The paper identifies three examples of how the new method, called MMACE (Molecular Model Agnostic Counterfactual Explanations), can be used to explain why:

a molecule is predicted to permeate the blood-brain barrier
a small molecule is predicted to be soluble
a molecule is predicted to inhibit HIV

The lab had to overcome some major challenges in developing MMACE. They needed a method that could be adapted for the wide array of machine-learning methods that are used in chemistry. In addition, searching for the most similar molecule for any given scenario was challenging because of the sheer number of possible candidate molecules.

Coauthor Aditi Seshadri helped solve that problem by suggesting the group adapt the STONED (Superfast traversal, optimization, novelty, exploration, and discovery) algorithm developed at the University of Toronto. STONED efficiently generates similar molecules, the fuel for counterfactual generation. Seshadri is an undergraduate researcher in White's lab and was able to help on the project via a Rochester summer research program called "Discover."
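To give a rough sense of what "generating similar molecules" means, the sketch below makes small random edits to a sequence of tokens and collects the distinct results. It is not the STONED algorithm, which works on a specialized molecular string representation; the token alphabet, the `mutate` and `sample_neighbors` functions, and the single-edit scheme are all hypothetical simplifications.

```python
import random

# Hypothetical token alphabet, chosen only for illustration.
ALPHABET = ["C", "N", "O", "F", "Cl"]


def mutate(tokens, rng):
    """Return a copy of `tokens` with one random insert, delete, or replace."""
    tokens = list(tokens)
    op = rng.choice(["insert", "delete", "replace"])
    # A single-token sequence is always grown, so the result is never empty.
    if op == "insert" or len(tokens) == 1:
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(ALPHABET))
    elif op == "delete":
        del tokens[rng.randrange(len(tokens))]
    else:
        tokens[rng.randrange(len(tokens))] = rng.choice(ALPHABET)
    return tuple(tokens)


def sample_neighbors(base, n_samples, seed=0):
    """Collect distinct one-edit neighbors of `base` from `n_samples` tries."""
    rng = random.Random(seed)  # seeded so repeated runs give the same set
    base = tuple(base)
    neighbors = set()
    for _ in range(n_samples):
        candidate = mutate(base, rng)
        if candidate != base:  # a replace can reproduce the base; skip it
            neighbors.add(candidate)
    return neighbors


neighbors = sample_neighbors(["C", "C", "O"], n_samples=50, seed=1)
```

Each neighbor would then be passed to the model being explained, and those with a different prediction become counterfactual candidates. A real pipeline would also need to check that each generated string corresponds to a valid molecule, which this toy version does not do.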

White says his team is continuing to improve MMACE, by trying other databases in their search for most similar molecules, for example, and refining the definition of molecular similarity.

More information: Geemi P. Wellawatte et al, Model agnostic generation of counterfactual explanations for molecules, Chemical Science (2022). DOI: 10.1039/D1SC05259D

Journal information: Chemical Science

Provided by University of Rochester

Citation: Using 'counterfactuals' to verify predictions of drug safety (2022, May 2) retrieved 26 April 2024 from https://phys.org/news/2022-05-counterfactuals-drug-safety.html
