June 2, 2017 feature
Researchers investigate decision-making by physical phenomena
In a new study, a team of researchers from Japan has demonstrated that the ultrafast, chaotic oscillatory dynamics in lasers make these devices capable of decision making and reinforcement learning, which is one of the major components of machine learning. To the best of the researchers' knowledge, this is the first demonstration of ultrafast photonic decision making or reinforcement learning, and it opens the door to future research on "photonic intelligence."
"In our demonstration, we utilize the computational power inherent in physical phenomena," coauthor Makoto Naruse at the National Institute of Information and Communications Technology in Tokyo told Phys.org. "The computational power of physical phenomena is based on 'infinite degrees of freedom,' and its resulting 'nonlocality of interactions' and 'fluctuations.' It contains completely new computational principles. Such systems provide huge potential for our future intelligence-oriented society. We call such systems 'natural Intelligence' in contrast to artificial intelligence."
In experiments, the researchers demonstrated that the optimal rate at which laser chaos can make decisions is one decision per 50 picoseconds (20 decisions per nanosecond), a speed that is unachievable by other mechanisms. At this speed, decision making based on laser chaos has potential applications in areas such as high-frequency trading, data center infrastructure management, and other high-end uses.
The researchers demonstrated the laser's ability by having it solve the multi-armed bandit problem, which is a fundamental task in reinforcement learning. In this problem, the decision-maker plays various slot machines with different winning probabilities, and must find the slot machine with the highest winning probability in order to maximize its total reward. In this game, there is a tradeoff between spending time exploring different slot machines and making a quick decision: exploring may waste time, but if a decision is made too quickly, the best machine may be overlooked.
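The tradeoff can be made concrete with a small simulation. The Python sketch below is not the researchers' method: it uses a deliberately naive "explore for a fixed number of rounds, then play greedily" strategy, and the function name, the winning probabilities, and the round counts are all illustrative assumptions.

```python
import random


def explore_then_greedy(win_probs, explore_rounds, total_rounds, seed=0):
    """Play a multi-armed bandit with a naive strategy (illustrative only).

    During the first `explore_rounds` plays, the machines are tried in turn.
    Afterwards, the machine with the best observed win rate is played.
    Returns the total reward and the number of plays per machine.
    """
    rng = random.Random(seed)
    n = len(win_probs)
    wins = [0] * n
    plays = [0] * n
    total = 0

    def observed_rate(i):
        # Machines that were never played are treated as having rate 0.
        return wins[i] / plays[i] if plays[i] else 0.0

    for t in range(total_rounds):
        if t < explore_rounds:
            machine = t % n                           # explore: round-robin
        else:
            machine = max(range(n), key=observed_rate)  # exploit: best so far
        won = rng.random() < win_probs[machine]       # hypothetical slot machine
        plays[machine] += 1
        wins[machine] += won
        total += won
    return total, plays


if __name__ == "__main__":
    probs = [0.2, 0.8]  # assumed winning probabilities, hidden from the player
    for explore in (0, 10, 100):
        reward, plays = explore_then_greedy(probs, explore, 100)
        print(f"explore={explore:3d}  reward={reward:3d}  plays={plays}")
```

With no exploration, this strategy never leaves machine 0, so the better machine is overlooked. With exploration in every round, half of the plays are spent on the worse machine.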
A key to the laser's ability is combining laser chaos with a decision-making strategy known as "tug of war," so called because the decision-maker is constantly being "pulled" toward one slot machine or the other, depending on the feedback it receives from its previous play. To realize this strategy in a laser, the researchers combined the laser with a threshold adjuster whose value shifts so as to favor the slot machine with the higher reward probability. As the researchers explain, the laser-based decision maker produces a different output value depending on the threshold value.
"Let us call one of the slot machines 'machine 0' and the other 'machine 1'," said coauthor Songju Kim, at the National Institute for Materials Science in Tsukuba, Japan. "The output of the laser-based decision maker is '0' or '1.' If the signal level of the chaotic oscillatory dynamics is higher than the threshold value (which is dynamically configured), then the output is '0,' and this directly means that the decision is to choose 'machine 0.' If the signal level of the chaotic oscillatory dynamics is lower than the threshold value (which is dynamically configured), then the output is '1,' and this directly means that the decision is to choose 'machine 1.'"
The researchers expect that this system can be scaled up, extended to higher-grade machine learning problems, and lead to new applications of laser chaos in the field of artificial intelligence.
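The threshold rule Kim describes can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the published implementation: a pseudo-random number stands in for the sampled laser chaos signal, and the threshold update (a fixed step, clamped to the signal range) is a simplification invented for this example. Only the comparison in `decide` comes directly from the quote.

```python
import random


def decide(signal, threshold):
    """Rule from the quote: signal above the threshold -> machine 0, else machine 1."""
    return 0 if signal > threshold else 1


def run_tug_of_war(win_probs, rounds, step=0.1, seed=0):
    """Two-machine tug of war with an adjustable threshold (assumed update rule)."""
    rng = random.Random(seed)
    threshold = 0.0
    choices = []
    for _ in range(rounds):
        # Assumption: uniform noise in [-1, 1] replaces the chaotic laser signal.
        signal = rng.uniform(-1.0, 1.0)
        machine = decide(signal, threshold)
        won = rng.random() < win_probs[machine]
        # A win on machine 0 or a loss on machine 1 pulls toward machine 0,
        # which means lowering the threshold; the other outcomes raise it.
        pulls_toward_zero = (machine == 0) == won
        threshold += -step if pulls_toward_zero else step
        threshold = max(-1.0, min(1.0, threshold))  # keep within signal range
        choices.append(machine)
    return choices, threshold


if __name__ == "__main__":
    choices, threshold = run_tug_of_war([0.8, 0.2], rounds=200)
    print("final threshold:", threshold)
    print("share of machine 0 in last 50 plays:", choices[-50:].count(0) / 50)
```

In this sketch a lower threshold makes output '0' more likely, so rewards from machine 0 and losses from machine 1 both push the threshold down, while the remaining fluctuations in the signal keep some exploration going.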
Reinforcement learning involves decision making in dynamic and uncertain environments, and constitutes one important element of artificial intelligence (AI). In this paper, we experimentally demonstrate that the ultrafast chaotic oscillatory dynamics of lasers efficiently solve the multi-armed bandit problem (MAB), which requires decision making concerning a class of difficult trade-offs called the exploration-exploitation dilemma. To solve the MAB, a certain degree of randomness is required for exploration purposes. However, pseudo-random numbers generated using conventional electronic circuitry encounter severe limitations in terms of their data rate and the quality of randomness due to their algorithmic foundations. We generate laser chaos signals using a semiconductor laser sampled at a maximum rate of 100 GSample/s, and combine it with a simple decision-making principle called tug-of-war with a variable threshold, to ensure ultrafast, adaptive and accurate decision making at a maximum adaptation speed of 1 GHz. We found that decision-making performance was maximized with an optimal sampling interval, and we highlight the exact coincidence between the negative autocorrelation inherent in laser chaos and decision-making performance. This study paves the way for a new realm of ultrafast photonics in the age of AI, where the ultrahigh bandwidth of photons can provide new value.
© 2017 Phys.org