Machine learning improves accuracy of particle identification at LHC
Scientists from the Higher School of Economics have developed a method that allows physicists at the Large Hadron Collider (LHC) to distinguish between various types of elementary particles with a high degree of accuracy. The results were published in the Journal of Physics.
One of the major unsolved problems of modern physics is the predominance of matter over antimatter in the universe. Both formed within a second after the Big Bang, presumably in equal amounts, and physicists are trying to understand where the antimatter has gone. Back in 1966, Russian scientist Andrei Sakharov suggested that the imbalance between matter and antimatter appeared as a result of CP violation, i.e., an asymmetry between particles and antiparticles. Thus, after particles and antiparticles annihilated (mutually destroyed) each other, only the unbalanced excess of particles remained.
The Large Hadron Collider beauty experiment (LHCb) studies unstable particles called B-mesons, whose decays demonstrate the clearest asymmetry between matter and antimatter. The LHCb consists of several specialised detectors, including calorimeters that measure the energy of neutral particles. Calorimeters also identify different types of particles. This is done by searching for and analysing the corresponding clusters of energy deposition. It is, however, not easy to separate the signals from two types of photons: primary photons and photons from the decay of energetic π0 mesons. HSE scientists developed a method to classify these two with high accuracy.
The authors of the study applied artificial neural networks and gradient boosting (a machine-learning algorithm) to classify energy clusters based on the energies collected in their individual cells.
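As a rough illustration of what passing the energies of individual cells to a classifier can look like, the sketch below cuts a window of five by five cells around the cell with the largest energy and flattens it into a list of 25 raw values. It is not the authors' code: the function name `extract_window`, the zero padding at the calorimeter edge and the toy grid are all assumptions made for this example.

```python
def extract_window(grid, size=5):
    """Return the size x size window of raw cell energies centred on the
    cell with the largest energy, flattened row by row.

    Cells that fall outside the grid are filled with 0.0 (an assumption
    made for this sketch, not something stated in the study).
    """
    half = size // 2
    n_rows, n_cols = len(grid), len(grid[0])

    # Locate the cell with the largest energy deposit (the cluster centre).
    seed_row, seed_col = max(
        ((r, c) for r in range(n_rows) for c in range(n_cols)),
        key=lambda rc: grid[rc[0]][rc[1]],
    )

    features = []
    for r in range(seed_row - half, seed_row + half + 1):
        for c in range(seed_col - half, seed_col + half + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            features.append(grid[r][c] if inside else 0.0)
    return features


# Hypothetical 7 x 7 patch of calorimeter cells with one energetic deposit.
grid = [[0.0] * 7 for _ in range(7)]
grid[3][4] = 10.0  # cell with the largest energy
grid[3][5] = 2.0   # neighbouring cell sharing part of the shower
features = extract_window(grid)
print(len(features), features[12])
```

The resulting list is the kind of raw input that could be handed to a neural network or a gradient-boosting classifier without first computing hand-made shape variables.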
"We took a five-by-five matrix with a centre at the calorimeter cell with the largest energy," says Fedor Ratnikov, one of the study's authors and a leading researcher in the HSE Laboratory of Methods for Big Data Analysis. "Instead of analysing the special characteristics constructed from raw energies in cluster cells, we pass these raw energies directly to the algorithm for analysis. The machine was able to make sense of the data better than a person."
Compared with the previous method, which relied on data pre-processing, the new machine-learning-based method has quadrupled quality metrics for the identification of particles in the calorimeter. The algorithm improved the classification quality score from 0.89 to 0.97; the higher this figure, the better the classifier works. At a 98 percent efficiency for identifying primary photons, the new approach lowered the rate of false photon identification from 60 percent to 30 percent.
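One plausible way to read the last pair of figures is as a false identification rate measured at a fixed photon efficiency: the cut on the classifier output is chosen so that 98 percent of true primary photons pass, and one then counts the fraction of π0 photons that pass the same cut. The sketch below shows that calculation under this assumption; the function name and the toy scores are hypothetical, and the article does not say exactly how the study computed its numbers.

```python
import math


def background_rate_at_efficiency(signal_scores, background_scores, efficiency=0.98):
    """Fraction of background candidates that pass the tightest score cut
    which still keeps at least `efficiency` of the signal candidates.

    Higher scores are assumed to mean "more photon-like".
    """
    ranked = sorted(signal_scores, reverse=True)
    # Number of signal candidates that must survive the cut.
    n_keep = math.ceil(efficiency * len(ranked))
    threshold = ranked[n_keep - 1]
    passed = sum(1 for score in background_scores if score >= threshold)
    return passed / len(background_scores)


# Hypothetical classifier outputs, for illustration only.
primary_photons = [0.9, 0.8, 0.7, 0.2]
pi0_photons = [0.1, 0.3, 0.75, 0.95]
rate = background_rate_at_efficiency(primary_photons, pi0_photons, efficiency=0.75)
print(rate)
```

Fixing the efficiency first makes two classifiers directly comparable: both keep the same share of true photons, so the one that lets fewer false photons through is the better one.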
The proposed method is unique in that it allows elementary particles to be identified without first studying the characteristics of the cluster being analysed. "We pass the data to machine learning in the hope that the algorithm finds correlations we might not have considered. The approach obviously worked out in this case," Fedor Ratnikov concludes.