Scientists from MIPT (Moscow Institute of Physics and Technology) and MSU (Moscow State University), led by Yan Ivanenkov, have developed the first computer model that predicts agrochemical activity, that is, the beneficial influence of simple molecules on plants. Using an independent test set and the results of their own experiments, they showed that the model has high predictive power. The work is published in the scientific journal Phytochemistry.
To construct the model, the authors used machine learning methods, in particular Kohonen self-organizing maps. The training sample consisted of 1,800 carefully selected agrochemicals, drawn from patents, scientific publications and specialized databases. Importantly, the model was also able to predict the activity class of the molecules, that is, what specific impact they would have on the plant. This prediction had an accuracy of 87 percent; the prediction accuracy of the molecules' activity was 67 percent.
The molecules of interest to agrochemistry can be divided into two categories: pesticides (which fight insects, weeds and fungi) and plant growth regulators (which stimulate or inhibit plant growth). To discover a new active molecule in either group, scientists conduct costly experiments: they synthesize a large number of molecules (usually several thousand) and then test their effect on cells or whole plants. In a significant percentage of cases, however, such experiments do not produce the desired results; at best, only a few tens of the molecules turn out to be active. The task, therefore, is to select in advance a much smaller set of molecules (compared with the initial number) for further experiments. This would significantly reduce both the time and the financial cost of searching for active molecules.
In their work, the authors used the concept of a chemical space, in which each molecule is described for modeling purposes by a set of specific parameters (molecular descriptors). The value of each descriptor reflects a particular property of the molecule: its solubility, size, polar surface area, and so on. Each molecule is thus encoded by its set of descriptor values as a point with definite coordinates in the chemical space.
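The idea of a molecule as a point in descriptor space can be illustrated with a minimal Python sketch. The molecule names and descriptor values below are invented purely for illustration and are not taken from the study; the scaling step is a common preprocessing assumption, not something the article describes.

```python
import math
from statistics import mean, pstdev

# Hypothetical descriptor values, invented purely for illustration.
DESCRIPTOR_NAMES = ["log_solubility", "molecular_weight", "polar_surface_area"]
MOLECULES = {
    "molecule_A": [-2.1, 221.0, 46.5],
    "molecule_B": [-3.8, 215.7, 62.2],
    "molecule_C": [-0.9, 169.1, 94.8],
}

def standardize(rows):
    """Scale every descriptor (column) to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

def euclidean(a, b):
    """Distance between two molecules in descriptor space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

names = sorted(MOLECULES)
# Each molecule becomes one point; each descriptor is one coordinate axis.
points = standardize([MOLECULES[name] for name in names])
distance_ab = euclidean(points[0], points[1])
```

Scaling matters here because descriptors such as molecular weight and solubility live on very different numeric ranges; without it, the largest-valued descriptor would dominate any distance between molecules.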
Using the Kohonen algorithm, an unsupervised learning method (that is, one that works without a teacher), the researchers reduced the dimensionality of this data with the least error and visualized the result as a two-dimensional map, a form convenient for analysis. On this map they highlighted, one by one, the areas occupied by molecules of different categories. Then, using this map, they evaluated the classification ability of the model. If this ability is high (for example, greater than 70 percent for large-scale tasks), the model can be tested on an independent test set of molecules that were not involved in the learning process. That is what the authors did, clearly demonstrating that their model predicts the specific activity of new molecules by assigning them to one of the commonly accepted categories: herbicides, plant growth regulators, etc.
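The workflow described above (train a map without labels, mark the map areas by category, then check predictions on molecules held out of training) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the data are synthetic, the two class names are placeholders, and the grid size, learning rate, neighbourhood radius and the rule of classifying by the nearest labelled node are all choices made for the example.

```python
import math
import random
from collections import Counter, defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, x):
    """Grid position of the node whose weight vector is closest to x."""
    return min(weights, key=lambda pos: sq_dist(weights[pos], x))

def train_som(data, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Unsupervised training: class labels are never used here."""
    rng = random.Random(seed)
    # Initialise every node with a randomly chosen training vector.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning rate decays
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood shrinks
        for x in rng.sample(data, len(data)):     # shuffled pass over data
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])  # pull node toward x
    return weights

def label_nodes(weights, data, labels):
    """Mark each occupied node with the majority class of its molecules."""
    votes = defaultdict(Counter)
    for x, y in zip(data, labels):
        votes[best_matching_unit(weights, x)][y] += 1
    return {pos: c.most_common(1)[0][0] for pos, c in votes.items()}

def predict(weights, node_labels, x):
    """Assumed rule: take the class of the nearest labelled node."""
    pos = min(node_labels, key=lambda p: sq_dist(weights[p], x))
    return node_labels[pos]

def make_cluster(rng, center, n):
    """Synthetic stand-in for molecules of one activity class."""
    return [[rng.gauss(c, 0.3) for c in center] for _ in range(n)]

rng = random.Random(42)
classes = {"herbicide": [0.0, 0.0, 0.0], "growth_regulator": [3.0, 3.0, 3.0]}
train_x, train_y, test_x, test_y = [], [], [], []
for name, center in classes.items():
    train_x += make_cluster(rng, center, 40)
    train_y += [name] * 40
    test_x += make_cluster(rng, center, 10)   # held out of training
    test_y += [name] * 10

som = train_som(train_x)
node_labels = label_nodes(som, train_x, train_y)
predictions = [predict(som, node_labels, x) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"test-set accuracy: {accuracy:.2f}")
```

Note that the class labels enter only after training, when the nodes are marked; the map itself is organized from the descriptor vectors alone, which is what "without a teacher" refers to.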
"It is important to note that the model has good differential predictive power, and it is the first one in the field of agricultural chemistry built with the use of such an impressive learning sample set. In the course of work, we, together with colleagues from the Laboratory for the Development of Innovative Drugs, were able to test the model using the real test results that we obtained ourselves. In the future, we plan to enhance the learning model and improve its predictive ability, possibly with the use of other machine learning algorithms," said Yan Ivanenkov, the lead author and head of MIPT's Laboratory of Medical Chemistry and Bioinformatics.
In the future, similar computational models will significantly reduce the cost of searching for new active molecules and contribute to the understanding of their mechanisms of action.
Nikolay A. Bushkov et al., Computational insight into the chemical space of plant growth regulators, Phytochemistry (2016). DOI: 10.1016/j.phytochem.2015.12.006