A new method of modeling drug-target interactions fixes a detrimental bias of past techniques

June 28, 2017, Agency for Science, Technology and Research (A*STAR), Singapore

"Drug discovery is a very long process. At each stage, you may find your drug is not good enough and you need to seek another candidate," explains A*STAR's Xiao-Li Li. His team won 'best paper' at the 2016 International Conference on Bioinformatics for a novel approach to correcting an intrinsic problem with machine learning methods.

Computer simulation, or 'in silico' techniques, can improve accuracy and shorten the drawn-out, hugely expensive road to bringing a drug to market, which averages more than 12 years and $US1.8 billion.

Many computer simulations, however, first require 'training' on datasets of known drugs and their targets. This data can include additional information on 3-D structure, chemical composition, and other molecular properties. Drawing on trends in this database of known data, the simulation can then predict the interactions of unknown molecules, pointing to new drugs and new proteins.

However, of all the possible drug-target combinations in the database, only some will interact. Interacting pairs are far outnumbered by non-interacting pairs, a problem referred to as 'between-class imbalance'. Further imbalance arises because the interactions themselves fall into different subtypes of unequal size, dubbed 'within-class imbalance'.

"Any computational models that are designed to optimize accuracy will be biased and will tend to classify unknown pairs into majority or non-interaction class," says Li. "Majority classes are better represented in data than minority interaction classes—this skews these models and produces errors. Data imbalance is a challenging issue."
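To see why a model tuned purely for accuracy drifts toward the majority class, consider a rough calculation with the dataset sizes reported in this article (12,600 known interactions and around 18 million non-interacting pairs). The sketch below is illustrative only: the function name and the choice of balanced accuracy as a comparison metric are assumptions made for this example, not part of the team's method.

```python
def majority_baseline(n_interacting: int, n_non_interacting: int) -> dict:
    """Score a trivial model that labels every pair 'non-interacting'."""
    total = n_interacting + n_non_interacting
    # Every non-interacting pair is classified correctly, every interaction is missed.
    accuracy = n_non_interacting / total
    recall_interacting = 0.0       # no true interaction is ever found
    recall_non_interacting = 1.0   # every non-interaction is "found"
    # Balanced accuracy averages the per-class recalls, so it ignores class size.
    balanced_accuracy = (recall_interacting + recall_non_interacting) / 2
    return {
        "accuracy": accuracy,
        "recall_interacting": recall_interacting,
        "balanced_accuracy": balanced_accuracy,
        "imbalance_ratio": n_non_interacting / n_interacting,
    }


# Dataset sizes as quoted in the article; 18 million is an approximate figure.
scores = majority_baseline(12_600, 18_000_000)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Under these figures the do-nothing model scores above 99.9 percent accuracy while finding none of the interactions, which is the bias Li describes. A metric that weights both classes equally, such as balanced accuracy, exposes the failure by dropping to 0.5.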

Li's team at the A*STAR Institute for Infocomm Research sought to overcome this by developing an 'imbalance-aware' algorithm that more accurately predicted drug-target interactions based on a database of 12,600 known interactions and around 18 million known non-interacting pairs. The algorithm was designed to better recognize underrepresented interaction groups and enhance the data within them.
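One generic way to address both kinds of imbalance is resampling: replicate examples from small interaction subgroups until every subgroup is equally represented, and draw only a subset of the non-interacting pairs. The sketch below shows that general idea on toy data. It is not the team's published algorithm; the helper names, the subgroup labels 'A' and 'B', and the toy pairs are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample_groups(samples, group_of, rng):
    """Replicate members of small groups until every group matches the largest."""
    groups = defaultdict(list)
    for sample in samples:
        groups[group_of(sample)].append(sample)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw with replacement to fill the gap (draws nothing if already at target).
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


def undersample(samples, k, rng):
    """Keep at most k samples, drawn without replacement."""
    return rng.sample(samples, k) if len(samples) > k else list(samples)


rng = random.Random(0)  # fixed seed so the toy run is repeatable

# Toy interacting pairs: (drug, target, subgroup). Subgroup 'B' is underrepresented.
interacting = [(f"drug{i}", f"target{i}", "A") for i in range(6)]
interacting += [(f"drug{i}", f"target{i}", "B") for i in range(6, 8)]
# Toy non-interacting pairs, far more numerous than the interacting ones.
non_interacting = [(f"drug{i}", f"target{j}") for i in range(10) for j in range(10, 20)]

balanced_pos = oversample_groups(interacting, lambda s: s[2], rng)
balanced_neg = undersample(non_interacting, len(balanced_pos), rng)

print(Counter(s[2] for s in balanced_pos))
print(len(balanced_pos), len(balanced_neg))
```

After resampling, both toy subgroups contribute six interacting examples each, and the non-interacting class is cut down to the same total size, so a classifier trained on this data no longer benefits from simply predicting 'no interaction'.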

By improving the ability of the computer model to focus on the most useful data (the interactions), the team created a system that outperformed existing modeling techniques, predicting new, unknown drug-target interactions with high accuracy.

The future of machine learning depends on artificial intelligence and advanced learning such as 'deep learning.' Nevertheless, as Li adds: "data is key. In order to further enhance our predictive capability, the first thing we can do is collect more relevant data about drugs and targets."

More information: Ali Ezzat et al. Drug-target interaction prediction via class imbalance-aware ensemble learning, BMC Bioinformatics (2016). DOI: 10.1186/s12859-016-1377-y
