March 9, 2020

Machine-learning technology to track odd events among LHC data

A simulated CMS collision where a long-lived particle is produced together with other 'regular' jets. The long-lived particle travels for a short distance before it decays, creating particles that appear displaced from the point where the LHC beams collided. Credit: CERN

Nowadays, artificial neural networks have an impact on many areas of our day-to-day lives. They are used for a wide variety of complex tasks, such as driving cars, performing speech recognition (for example, Siri, Cortana, Alexa), suggesting shopping items and trends, or improving visual effects in movies (e.g., animated characters such as Thanos from the movie Infinity War by Marvel).

Traditionally, algorithms are handcrafted to solve complex tasks. This requires experts to spend a significant amount of time identifying the optimal strategies for various situations. Artificial neural networks, inspired by interconnected neurons in the brain, can automatically learn a close-to-optimal solution for the given objective from data. Often, the automated learning or "training" required to obtain these solutions is "supervised" through the use of supplementary information provided by an expert. Other approaches are "unsupervised" and can identify patterns in the data without such guidance.

The mathematical theory behind artificial neural networks has evolved over several decades, yet only recently have we developed our understanding of how to train them efficiently. The required calculations are very similar to those performed by standard video graphics cards (which contain a graphics processing unit, or GPU) when rendering three-dimensional scenes in video games. Exploiting the massively parallel computing capabilities of general-purpose GPUs makes it possible to train artificial neural networks in a relatively short amount of time. The flourishing video game industry has driven the development of GPUs. This advancement, along with the significant progress in machine learning theory and the ever-increasing volume of digitised information, has helped to usher in the age of artificial intelligence and "deep learning".

In the field of high energy physics, machine learning techniques such as simple neural networks or decision trees have been in use for several decades. More recently, the theory and experimental communities have increasingly turned to state-of-the-art techniques, such as "deep" neural network architectures, to help us understand the fundamental nature of our Universe. The standard model of particle physics is a coherent collection of physical laws, expressed in the language of mathematics, that govern the fundamental particles and forces, which in turn explain the nature of our visible Universe. At the CERN LHC, many scientific results focus on the search for new "exotic" particles that are not predicted by the standard model. These hypothetical particles are the manifestations of new theories that aim to answer questions such as: why is the Universe predominantly composed of matter rather than antimatter, or what is the nature of dark matter?

Figure 1: Schematic of the network architecture. The upper (orange and blue) sections of the diagram illustrate the components of the network that are used to distinguish jets produced in the decays of long-lived particles from jets produced by other means, trained with simulated data. The lower (green) part of the diagram shows the components that are trained using real collision data. Credit: CERN

Figure 2: An illustration of the performance of the network. The coloured curves represent the performance for different theoretical supersymmetric models. The horizontal axis gives the efficiency for correctly identifying a long-lived particle decay (i.e. the true-positive rate). The vertical axis shows the corresponding false-positive rate, which is the fraction of standard jets mistakenly identified as originating from the decay of a long-lived particle. As an example, take the point on the red curve where the fraction of genuine long-lived particles that are correctly identified is 0.5 (i.e. 50%). At this point, the method misidentifies only one regular jet in every thousand as originating from a long-lived particle decay. Credit: CERN
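The working point quoted in the Figure 2 caption (a true-positive rate of 0.5 with a false-positive rate of about one in a thousand) can be made concrete with a small sketch. The function name `working_point` and the toy score distributions below are hypothetical and chosen only for illustration; they are not CMS data or CMS code. The idea is simply to choose the cut on the network output that keeps the requested fraction of genuine long-lived-particle jets, then count how many regular jets pass the same cut.

```python
import random


def working_point(signal_scores, background_scores, target_efficiency):
    """Return (threshold, true_positive_rate, false_positive_rate).

    The threshold is the network-output cut that keeps roughly
    `target_efficiency` of the genuine signal jets.
    """
    # Rank signal scores from most to least signal-like and cut after
    # the requested fraction of them.
    ranked = sorted(signal_scores, reverse=True)
    n_keep = max(1, round(target_efficiency * len(ranked)))
    threshold = ranked[n_keep - 1]

    # Fraction of each sample that passes the cut.
    tpr = sum(s >= threshold for s in signal_scores) / len(signal_scores)
    fpr = sum(b >= threshold for b in background_scores) / len(background_scores)
    return threshold, tpr, fpr


# Toy network outputs in [0, 1] (hypothetical, not CMS data):
# signal jets peak towards 1, regular jets peak towards 0.
rng = random.Random(0)
signal = [rng.betavariate(5, 2) for _ in range(10000)]
background = [rng.betavariate(1, 5) for _ in range(10000)]

threshold, true_positive_rate, false_positive_rate = working_point(
    signal, background, target_efficiency=0.5
)
print(f"cut at {threshold:.3f}: TPR = {true_positive_rate:.3f}, "
      f"FPR = {false_positive_rate:.4f}")
```

Scanning `target_efficiency` from 0 to 1 and plotting the resulting pairs of rates would trace out a curve of the kind shown in Figure 2.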

Recently, searches for new particles that exist for more than a fleeting moment in time before decaying to ordinary particles have received particular attention. These "long-lived" particles can travel measurable distances (fractions of millimetres or more) from the proton-proton collision point in each LHC experiment before decaying. Often, theoretical predictions assume that the long-lived particle is undetectable. In that case, only the particles from the decay of the undiscovered particle will leave traces in the detector systems, leading to the rather atypical experimental signature of particles apparently appearing from out of nowhere and displaced from the collision point.
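To give a feel for the distances involved, the standard relativistic relation for the mean flight distance, L = (p/m) × cτ, can be evaluated directly. The sketch below is only an illustration: the function names and the example mass, momentum and proper decay length are hypothetical and are not taken from the CMS study.

```python
import math


def mean_decay_length_mm(mass_gev, momentum_gev, ctau_mm):
    """Mean lab-frame flight distance L = (p / m) * c*tau.

    With the mass in GeV/c^2 and the momentum in GeV/c, the ratio p/m
    equals the relativistic boost factor beta*gamma.
    """
    boost = momentum_gev / mass_gev
    return boost * ctau_mm


def fraction_travelling_beyond(distance_mm, mass_gev, momentum_gev, ctau_mm):
    """Probability that the particle travels at least `distance_mm`.

    Decay is an exponential process, so the survival probability
    falls as exp(-d / L).
    """
    mean_length = mean_decay_length_mm(mass_gev, momentum_gev, ctau_mm)
    return math.exp(-distance_mm / mean_length)


# Hypothetical example: a 100 GeV particle with 200 GeV of momentum
# and a proper decay length (c*tau) of 1 mm.
length = mean_decay_length_mm(100.0, 200.0, 1.0)
fraction = fraction_travelling_beyond(1.0, 100.0, 200.0, 1.0)
print(f"mean flight distance: {length:.1f} mm")
print(f"fraction decaying more than 1 mm away: {fraction:.2f}")
```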

A novel aspect of this study involves the use of data from real collision events, as well as simulated events, to train the network. This approach is used because the simulation—although very sophisticated—does not exhaustively reproduce all the details of the real collision data. In particular, the jets arising from long-lived particle decays are challenging to simulate accurately. The effect of applying this technique, dubbed "domain adaptation," is that the information provided by the neural network agrees to a high level of accuracy for both real and simulated collision data. This behaviour is a crucial trait for algorithms that will be used by searches for rare new-physics processes, as the algorithms must demonstrate robustness and reliability when applied to data.
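The article does not spell out the training objective. One widely used formulation of domain adaptation, given here only as an illustrative assumption and not as the method confirmed by the source, is domain-adversarial training. In this sketch, θ_f denotes the parameters of the shared feature layers, θ_c those of the branch that classifies jets (trained on labelled simulated events), θ_d those of a branch that tries to tell real from simulated events, and λ a tunable weight. All of these symbols are introduced for illustration.

```latex
\begin{aligned}
\mathcal{L}(\theta_f, \theta_c, \theta_d)
  &= \mathcal{L}_{\text{class}}(\theta_f, \theta_c)
   - \lambda \, \mathcal{L}_{\text{domain}}(\theta_f, \theta_d), \\
(\hat{\theta}_f, \hat{\theta}_c)
  &= \arg\min_{\theta_f,\, \theta_c}
     \mathcal{L}(\theta_f, \theta_c, \hat{\theta}_d), \\
\hat{\theta}_d
  &= \arg\max_{\theta_d}
     \mathcal{L}(\hat{\theta}_f, \hat{\theta}_c, \theta_d).
\end{aligned}
```

Under this assumption, the domain branch is trained to separate real from simulated events as well as it can, while the shared layers are pushed in the opposite direction, towards features for which the two are indistinguishable. The jet classifier built on those features then tends to respond similarly to real and simulated collisions.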

Figure 3: Histograms of the output values from the neural network for real (black circular markers) and simulated (coloured filled histograms) proton-proton collision data without (left panel) and with (right panel) the application of domain adaptation. The lower panels display the ratios between the numbers of real data and simulated events obtained from each histogram bin. The ratios are significantly closer to unity in the right panel, which indicates an improved understanding of the neural network performance for real collision data. This understanding is crucial to reduce false-positive (and false-negative!) scientific results when searching for exotic new particles. Credit: CERN
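The per-bin ratio shown in the lower panels of Figure 3 is straightforward to compute. The sketch below uses hypothetical helper names (`histogram`, `data_to_simulation_ratio`) and assumes, purely for illustration, that the simulation is scaled to the same total number of events as the data; a real analysis would typically normalise the simulation differently.

```python
def histogram(values, n_bins, lo=0.0, hi=1.0):
    """Count how many values fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for value in values:
        if lo <= value <= hi:
            # A value exactly at the upper edge goes into the last bin.
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def data_to_simulation_ratio(data_scores, sim_scores, n_bins=10):
    """Per-bin ratio of data to simulation.

    The simulation is scaled to the total data yield (an assumption made
    for this sketch). Bins with no simulated events give None.
    """
    data_counts = histogram(data_scores, n_bins)
    sim_counts = histogram(sim_scores, n_bins)
    scale = sum(data_counts) / sum(sim_counts)
    return [
        d / (s * scale) if s > 0 else None
        for d, s in zip(data_counts, sim_counts)
    ]


# Hypothetical network outputs for a handful of events.
data_scores = [0.05, 0.15, 0.15, 0.95]
sim_scores = [0.05, 0.05, 0.15, 0.95]
ratios = data_to_simulation_ratio(data_scores, sim_scores, n_bins=10)
print(ratios)
```

Ratios close to 1 in every populated bin indicate that the simulation describes the network output seen in data, which is the behaviour the right panel of Figure 3 illustrates.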

The CMS Collaboration will deploy this new tool as part of its ongoing search for exotic, long-lived particles. This study is part of a larger, coordinated effort across all the LHC experiments to use modern machine learning techniques to improve both how the large data samples are recorded by the detectors and how they are subsequently analysed. For example, the use of domain adaptation may make it easier to deploy robust machine-learned models as part of future results. The experience gained from these types of study will increase the physics potential during Run 3, from 2021, and beyond with the High Luminosity LHC.

More information: A deep neural network to search for new long-lived particles decaying to jets: cms-results.web.cern.ch/cms-re … XO-19-011/index.html

Provided by CERN
