Credit: Michele Ceriotti / EPFL

Many drugs today are produced as powdered solids. But to fully understand how the active ingredients will behave once inside the body, scientists need to know their exact atomic-level structure. For instance, the way molecules are arranged inside a crystal has a direct impact on a compound's properties, such as its solubility. Researchers are therefore working hard to develop technologies that can easily identify the exact crystal structures of microcrystalline powders.

A team of EPFL scientists has now written a machine-learning program that can predict, in record time, how atoms will respond to an applied magnetic field. Combined with nuclear magnetic resonance (NMR) spectroscopy, these predictions can be used to determine the exact location of atoms in complex organic compounds. This could be of huge benefit to pharmaceutical companies, which must carefully monitor their molecules' structures to meet requirements for patient safety. The research has been published in Nature Communications.

Breakneck speeds with AI

NMR spectroscopy is a well-known and highly efficient method for probing the magnetic fields between atoms and determining how neighboring atoms interact with each other. However, full crystal determination by NMR spectroscopy requires extremely complicated, time-consuming calculations involving quantum chemistry – nearly impossible for molecules with very intricate structures.

But the program developed at EPFL can overcome these obstacles. The scientists trained their AI model on molecular structures taken from structural databases. "Even for relatively simple molecules, this model is almost 10,000 times faster than existing methods, and the advantage grows tremendously when considering more complex compounds," says Michele Ceriotti, head of the Laboratory of Computational Science and Modeling at EPFL's School of Engineering and co-author of the study. "To predict the NMR signature of a crystal with nearly 1,600 atoms, our technique – ShiftML – requires about six minutes; the same feat would have taken 16 years with conventional techniques."
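To put the two quoted timings side by side, the short Python sketch below converts the 16-year estimate into minutes and divides it by the six-minute ShiftML figure. Both numbers are rounded figures from the quote, the 365.25-day year is an assumption, and the function name is purely illustrative, so the result is an order-of-magnitude estimate rather than a benchmark.

```python
# Rough speed-up implied by the quoted timings for the ~1,600-atom crystal.
# Assumption: one year = 365.25 days; both timings are rounded in the quote.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def implied_speedup(conventional_years: float, shiftml_minutes: float) -> float:
    """Return how many times faster the quoted ShiftML run would be."""
    return conventional_years * MINUTES_PER_YEAR / shiftml_minutes


ratio = implied_speedup(conventional_years=16, shiftml_minutes=6)
print(f"Implied speed-up: roughly {ratio:,.0f}x")
```

Under these assumptions the ratio comes out at around 1.4 million, well above the factor of almost 10,000 quoted for relatively simple molecules, which is consistent with Ceriotti's remark that the advantage grows with molecular complexity.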

The speed-up opens the door to entirely different approaches to structure determination, ones that are faster and can handle larger molecules. "This is really exciting because the massive acceleration in computation times will allow us to cover much larger conformational spaces and correctly determine structures where it was just not previously possible. This puts most of the complex contemporary drug molecules within reach," says Lyndon Emsley, head of the Laboratory of Magnetic Resonance at EPFL's School of Basic Sciences and co-author of the study.

The program is now freely available online. "Anyone can upload a molecule and get its NMR signature in just a few minutes," says Ceriotti.

More information: Federico M. Paruzzo et al. Chemical shifts in molecular solids by machine learning, Nature Communications (2018). DOI: 10.1038/s41467-018-06972-x

Journal information: Nature Communications