Scientists improve deep learning method for neural networks
Researchers from the Institute of Cyber Intelligence Systems at the National Research Nuclear University MEPhI (Russia) have developed a new learning method for the restricted Boltzmann machine (a type of neural network) that optimizes semantic encoding, visualization and data recognition. The results were published in the journal Optical Memory and Neural Networks.
Today, deep neural networks with different architectures, such as convolutional, recurrent and autoencoder networks, are becoming an increasingly popular area of research. A number of high-tech companies, including Microsoft and Google, are using deep neural networks to design intelligent systems.
In deep learning systems, feature selection and tuning are automated, which means that the networks can work out effective hierarchical feature extractors on their own. Deep learning is characterized by training on large samples with a single optimization algorithm. Typical optimization algorithms adjust the parameters of all operations simultaneously and use the backpropagation method to efficiently estimate the effect of every network parameter on the error.
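To make the idea concrete, here is a minimal sketch of backpropagation on a tiny network with a single sigmoid hidden unit. It is not taken from the paper: the network shape, the parameter values and all function names are illustrative assumptions. The finite-difference check only confirms that the chain-rule gradients agree with numerical estimates.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    # Hypothetical network: y = w2 * sigmoid(w1 * x + b1) + b2
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2
    return h, y

def loss(params, x, t):
    # Squared error between the network output y and the target t
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2

def backprop(params, x, t):
    # Apply the chain rule from the output back toward the input
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - t                # dL/dy
    dw2 = dy * h              # dL/dw2
    db2 = dy                  # dL/db2
    dh = dy * w2              # dL/dh
    dz = dh * h * (1.0 - h)   # derivative of the sigmoid is h * (1 - h)
    dw1 = dz * x              # dL/dw1
    db1 = dz                  # dL/db1
    return [dw1, db1, dw2, db2]

def numeric_grad(params, x, t, eps=1e-6):
    # Central finite differences, used only to check backprop
    grads = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, t) - loss(down, x, t)) / (2 * eps))
    return grads

params = [0.5, -0.3, 0.8, 0.1]  # arbitrary illustrative values
analytic = backprop(params, x=1.5, t=1.0)
numeric = numeric_grad(params, x=1.5, t=1.0)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
```

One backward pass yields the gradient for every parameter at once, which is why the method scales to networks with many parameters, whereas the numerical check needs two loss evaluations per parameter.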
"The neural networks' ability to learn on their own is one of their most intriguing properties," explained Vladimir Golovko, professor at the MEPhI Institute of Cyber Intelligence Systems. "Just like biological systems, neural networks can model themselves, seeking to develop the best possible model of behavior."
In 2006, the field of neural network training saw a breakthrough when Geoffrey Hinton published a research paper on pre-training neural networks. He showed that multilayer neural networks could be pre-trained one layer at a time, treating each layer as a restricted Boltzmann machine, and then fine-tuned using backpropagation. These networks were named deep belief networks, or DBNs.
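The layer-by-layer scheme can be sketched roughly as follows. This is a simplified illustration that uses the widely known one-step contrastive divergence (CD-1) rule, not the new method proposed by Golovko, and it omits the final backpropagation fine-tuning. The class and function names, the toy data, the layer sizes and the learning rate are all assumptions made for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class RBM:
    """Toy restricted Boltzmann machine with binary hidden units (illustrative)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data
        ph0 = self.hidden_probs(v0)
        h0 = self.sample(ph0)
        # Negative phase: one reconstruction step (probabilities, not samples)
        v1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(ph0)):
                self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        for i in range(len(v0)):
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(ph0)):
            self.c[j] += lr * (ph0[j] - ph1[j])

def pretrain_stack(data, layer_sizes, epochs=20, seed=0):
    # Train one RBM per layer; each layer's hidden probabilities
    # become the training input of the next layer.
    rng = random.Random(seed)
    rbms, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v)
        rbms.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return rbms

data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
stack = pretrain_stack(data, layer_sizes=[3, 2])
top_code = stack[1].hidden_probs(stack[0].hidden_probs(data[0]))
```

In a full deep belief network, the weights obtained this way would serve as the starting point for supervised fine-tuning of the whole stack.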
Golovko analyzed the main issues and paradigms of deep machine learning and proposed a new learning method for the restricted Boltzmann machine. He demonstrated that the classical training rule for this network is a special case of the method he developed.
"American scientists Minsky and Papert once showed that from the point of view of pattern classification, the single layer perceptron with the threshold activation function forms a linear separating surface, which is the reason why it cannot solve the 'exclusive or' problem," Golovko noted. "This led to pessimistic conclusions about the further development of neural networks. However, the last statement is only true for a single layer perceptron with a threshold or a monotonic continuous activation function, for instance, a sigmoid function. When one uses the signal activation function, the single layer perceptron can solve the 'exclusive or' problem, since it can divide the area of ones and zeros into classes with the help of two straight lines."
The research also analyzed the prospects of using deep neural networks for data compression, visualization and recognition. Golovko also proposed a new approach to implementing semantic encoding, or hashing, based on deep auto-associative neural networks.
This deep learning method might be very useful for training search engines' neural networks, the author states, as it will improve the speed of searching for relevant images.
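Why hashing speeds up search can be illustrated with a small sketch. It assumes that a trained auto-associative network has already produced a short real-valued code for each item. The codes, item names and the 0.5 threshold below are invented for the example and do not come from the paper. Each code is turned into a bit string, and similar items are then found by comparing bits instead of full feature vectors.

```python
def to_hash(code, threshold=0.5):
    # Pack a real-valued code into an integer, one bit per component
    bits = 0
    for value in code:
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits

def hamming(a, b):
    # Number of differing bits between two hashes
    return bin(a ^ b).count("1")

def nearest(query_hash, index, k=2):
    # Rank stored items by Hamming distance to the query
    return sorted(index, key=lambda name: hamming(query_hash, index[name]))[:k]

# Hypothetical bottleneck codes for three images
codes = {
    "cat_1": [0.9, 0.1, 0.8, 0.2],
    "cat_2": [0.8, 0.2, 0.9, 0.4],
    "car_1": [0.1, 0.9, 0.2, 0.7],
}
index = {name: to_hash(code) for name, code in codes.items()}

query = to_hash([0.95, 0.05, 0.7, 0.1])
result = nearest(query, index, k=2)
```

Comparing two hashes costs one XOR and a bit count, which is much cheaper than computing distances between high-dimensional image features.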
These findings have great practical value: they have already found application in computer vision, speech recognition and bioinformatics.