Scientists develop unique in silico model to predict agrochemical activity of small-molecule organic compounds

February 22, 2016
The Kohonen model: the green gradient in the background corresponds to the molecules from the learning set that activate plant growth; the darker areas contain a larger number of molecules. The circles show the molecules of the test (experimental) set. Most of the tested molecules lie in the dark areas, which indicates the high predictive ability of the model. Credit: adapted from Bushkov et al. / Phytochemistry

Scientists from MIPT (Moscow Institute of Physics and Technology) and MSU (Moscow State University), under the leadership of Yan Ivanenkov, have developed the first computer model that predicts agrochemical activity, that is, the beneficial influence of simple molecules on plants. Using an independent test set and the results of their own study, they showed that the model has high predictive power. The work is published in the scientific journal Phytochemistry.

To construct the model, the authors used machine learning methods, in particular Kohonen self-organizing maps. The learning sample comprised 1,800 carefully selected agrochemicals, drawn from patents, scientific publications and specialized databases. Notably, the model was also able to predict the activity class of the molecules, that is, what specific impact they would have on the plant. This prediction had an accuracy of 87 percent; the prediction accuracy for the molecules' activity was 67 percent.

The molecules of interest to agrochemistry can be divided into two categories: pesticides (which fight against insects, weeds and fungi) and plant growth regulators (which stimulate or inhibit plant growth). To discover a new active molecule in either group, scientists conduct costly experiments: they synthesize a large number of molecules (usually several thousand) and then check their impact on cells or whole plants. In a significant percentage of cases, however, such experiments do not produce the desired results; at best, the active molecules amount to only a few tens. The task, therefore, is to significantly reduce the number of molecules, compared with the initial number, that are carried forward to further experiments. This will significantly reduce both the time and the financial cost of searching for active molecules.

In their work, the authors used the concept of a chemical space, in which each molecule is described for modeling by a set of specific parameters (molecular descriptors). The value of each descriptor reflects a particular property of the molecule: its solubility, size, polar surface area, etc. Each molecule in the chemical space is thus defined (coded) by a set of such parameters, much like a point with certain coordinates on a plane.
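The idea of coding molecules as points in a descriptor space can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the molecule names, the three descriptors and all numeric values are invented for the example, and the z-score scaling is an assumed preprocessing step that the article does not describe.

```python
import numpy as np

# Hypothetical descriptors and values, invented purely for illustration.
DESCRIPTOR_NAMES = ["mol_weight", "logp", "polar_surface_area"]
molecules = {
    "mol_a": [180.2, 1.2, 63.6],
    "mol_b": [221.0, 2.8, 46.5],
    "mol_c": [310.4, 3.9, 20.3],
}

def descriptor_matrix(mols):
    """Return molecule names and an (n_molecules, n_descriptors) array."""
    names = sorted(mols)
    X = np.array([mols[name] for name in names], dtype=float)
    return names, X

def standardize(X):
    """Scale each descriptor to zero mean and unit variance (assumed step)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant descriptors
    return (X - mean) / std

names, X = descriptor_matrix(molecules)
Z = standardize(X)  # each row is one molecule's coordinates in chemical space
```

Scaling matters in a sketch like this because distance-based methods would otherwise be dominated by whichever descriptor happens to have the largest numeric range.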

Dibenzazepine, a plant growth regulator, is one of the molecules that were correctly classified by the model. Credit: Bushkov et al. / Phytochemistry

Using the Kohonen algorithm, which learns without supervision (without a "teacher"), the researchers reduced the dimensionality of this data with the least error and visualized the result as a two-dimensional map convenient for analysis, on which they highlighted, one by one, the areas occupied by molecules of different categories. Then, using this map, they evaluated the classification ability of the model. If this ability is high (for large-scale tasks, for example, greater than 70 percent), the model can be tested on an independent test set of molecules that were not involved in the learning process. That is what the authors did, clearly demonstrating that their model predicts the specific activity of new molecules by assigning them to one of the commonly accepted categories: herbicides, plant growth regulators, etc.
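The workflow described above (train a map without labels, mark the map areas by category, then check predictions on molecules held out of training) can be sketched as follows. This is a minimal NumPy illustration and not the authors' implementation: the map size, learning-rate and neighbourhood schedules, the majority-vote labelling and the synthetic two-class data are all assumptions made for the example.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Grid position (row, col) of the node whose weights are closest to x."""
    d = ((weights - x) ** 2).sum(axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)

def train_som(X, rows=6, cols=6, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Unsupervised training of a rows x cols Kohonen map (labels are not used)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, X.shape[1]))
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps, step = epochs * len(X), 0
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5     # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            dist2 = ((grid - np.array(bmu)) ** 2).sum(axis=-1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def label_nodes(weights, X, y):
    """Give each occupied node the majority class of its training molecules."""
    votes = {}
    for x, label in zip(X, y):
        votes.setdefault(best_matching_unit(weights, x), []).append(label)
    return {node: max(set(v), key=v.count) for node, v in votes.items()}

def predict(weights, node_labels, x):
    """Class of the labelled node closest to x in descriptor space."""
    nodes = list(node_labels)
    d = [((weights[node] - x) ** 2).sum() for node in nodes]
    return node_labels[nodes[int(np.argmin(d))]]

# Synthetic stand-in data: two well-separated classes in a 3-descriptor space.
rng = np.random.default_rng(1)
X_train = np.vstack([rng.normal(-2, 0.5, (40, 3)), rng.normal(2, 0.5, (40, 3))])
y_train = [0] * 40 + [1] * 40
X_test = np.vstack([rng.normal(-2, 0.5, (10, 3)), rng.normal(2, 0.5, (10, 3))])
y_test = [0] * 10 + [1] * 10

weights = train_som(X_train)
node_labels = label_nodes(weights, X_train, y_train)
y_pred = [predict(weights, node_labels, x) for x in X_test]
accuracy = float(np.mean(np.array(y_pred) == np.array(y_test)))
```

The class labels enter only after training, when the map areas are marked, which mirrors the "without a teacher" step in the text. The accuracy on this toy data says nothing about the published model's figures.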

"It is important to note that the model has good differential predictive power, and it is the first one in the field of agricultural chemistry built with the use of such an impressive learning sample set. In the course of work, we, together with colleagues from the Laboratory for the Development of Innovative Drugs, were able to test the model using the real test results that we obtained ourselves. In the future, we plan to enhance the learning and improve its predictive ability—possibly with the use of other machine learning algorithms," said Yan Ivanenkov, the lead author and head of the MIPT's Laboratory of Medical Chemistry and Bioinformatics.

In the future, similar computational models will significantly reduce the cost of the search for new active molecules and contribute to the understanding of the mechanisms by which they work.

More information: Nikolay A. Bushkov et al, Computational insight into the chemical space of plant growth regulators, Phytochemistry (2016). DOI: 10.1016/j.phytochem.2015.12.006
