Data-mining for crystal 'gold' at SLAC's X-ray laser
A new tool for analyzing mountains of data from SLAC's Linac Coherent Light Source (LCLS) X-ray laser can produce high-quality images of important proteins using fewer samples. Scientists hope to use it to reveal the structures and functions of proteins that have proven elusive, and to mine data from past experiments for new information.
"Such analytical tools might be as important to LCLS experiments as better detectors, sample-delivery systems and other instruments," said Uwe Bergmann, director of LCLS and a member of a research collaboration that has tested the software. "Continued improvements in methods like this will be critical for very precious samples – particularly when time or the amount of sample is limited."
The software package, known as the Computational Crystallography Toolbox for X-ray Free-electron Lasers, or cctbx.xfel, was developed as part of an international project to study proteins involved in the oxygen-producing stages of photosynthesis, but it can be applied to other protein studies as well. It should be especially helpful in analyzing proteins that are difficult to crystallize in large quantities for experiments, including many relevant to fighting disease. The software toolbox is freely available online, and users can get help online or via email.
Detailed in a paper published in the March 16 edition of Nature Methods, the new tool is designed to glean more information from protein samples based on a customized, improved analysis of LCLS X-ray images.
The software finds new ways to precisely match LCLS data with Bragg's Law, the 101-year-old discovery that describes the mathematics of how X-rays project the molecular blueprints of tiny crystallized samples onto a detector. It does so by factoring in painstaking measurements of the surfaces of LCLS X-ray detectors.
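As a rough illustration of why detector geometry matters, Bragg's Law can be written as n·λ = 2·d·sin(θ), where λ is the X-ray wavelength, d is the spacing between planes of atoms in the crystal, and θ is the angle at which a diffracted spot appears. The short Python sketch below is not part of cctbx.xfel and does not use its API; the function names and the idealized flat, perpendicular detector are assumptions made only for this example. It shows how a small change in the assumed sample-to-detector distance shifts the predicted position of a spot.

```python
import math


def bragg_angle_deg(wavelength_angstrom, d_spacing_angstrom, order=1):
    """Bragg angle theta (degrees) from n * lambda = 2 * d * sin(theta).

    Hypothetical helper for illustration; not a cctbx.xfel function.
    """
    sin_theta = order * wavelength_angstrom / (2.0 * d_spacing_angstrom)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no diffraction: n * lambda must be positive and <= 2 * d")
    return math.degrees(math.asin(sin_theta))


def spot_radius_mm(wavelength_angstrom, d_spacing_angstrom,
                   detector_distance_mm, order=1):
    """Distance of a Bragg spot from the beam center on an idealized detector.

    Assumes a flat detector perpendicular to the beam, so the diffracted ray
    (deflected by 2 * theta) lands at r = distance * tan(2 * theta).
    """
    two_theta = 2.0 * math.radians(
        bragg_angle_deg(wavelength_angstrom, d_spacing_angstrom, order)
    )
    if two_theta >= math.pi / 2.0:
        raise ValueError("spot is scattered at 90 degrees or more and misses a forward detector")
    return detector_distance_mm * math.tan(two_theta)


if __name__ == "__main__":
    # Illustrative numbers only: 1.0 angstrom X-rays, 2.0 angstrom plane spacing.
    for distance_mm in (100.0, 100.5):
        radius = spot_radius_mm(1.0, 2.0, distance_mm)
        print(f"detector at {distance_mm:.1f} mm -> spot at {radius:.3f} mm from beam center")
```

In this simplified picture, an error in the assumed detector position translates directly into an error in where each spot is expected, which is one reason careful measurements of the detector surfaces help match the data to Bragg's Law.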
The software also analyzes spots in the X-ray images that other tools reject or overlook, such as streaked, curved, dim or fuzzy features, increasing the number of usable images. "In addition, it is designed to resolve sharper details of the atomic structure," Sauter said.
The developers adapted the software from LABELIT, a tool Sauter released a decade ago to analyze data from synchrotrons, the most widely used X-ray facilities for studying crystallized biological samples.
X-ray free-electron lasers such as LCLS, with ultrashort X-ray pulses that are millions of times brighter than synchrotron X-rays, are proving a powerful new force in solving molecular mysteries that synchrotrons cannot, but they bring a new set of scientific challenges.
At synchrotrons, scientists typically study frozen crystals one at a time, rotating each one slowly and taking multiple X-ray images.
LCLS can study smaller crystals, and can do so under more natural conditions, but it requires a much larger number of crystals, which are typically suspended in a liquid or gel and jetted into the path of the X-rays. Because the crystals are tumbling randomly when the X-ray snapshots are taken and only one image can be taken of each crystal, scientists must gather tens of thousands of high-quality images to get a complete picture of a protein structure. A recent experiment at LCLS collected enough data to fill about 2,335 standard Blu-ray video discs, Sauter said.
Junko Yano, a staff scientist at Berkeley Lab whose research team includes Berkeley Lab senior scientist Vittal Yachandra, has used the new data-analysis tool to study the molecular machinery at work in photosynthesis. She said even in cases where it is easy to produce crystals and generate a lot of data, the software could improve the resolution of protein structures by capturing more details from the highest-quality crystals.
"With many biological systems we may not have this luxury of easily producing a lot of crystals," she added, "so this will help us to minimize the amount of samples we need to collect high-quality data, both at LCLS and at other free-electron X-ray laser facilities that are coming on line."