Fujitsu today announced the development of the world's first handwriting recognition technology to surpass the human-equivalent recognition rate of 96.1% established at a conference, using AI technology modeled on human brain processes. Fujitsu had previously achieved top-level accuracy in this field, taking first place with a recognition rate of 94.8% at a handwritten Chinese character recognition contest held at the International Conference on Document Analysis and Recognition (ICDAR), a top-level conference in the document image processing field.
However, further increasing recognition accuracy required a new mechanism for learning the diversity of character deformations. Now, focusing on a hierarchical model with expanded connections between neurons, a model based on the human brain that captures the features of characters, Fujitsu has developed a technology that automatically creates numerous patterns of character deformation from a character's base pattern and uses them to "train" this hierarchical neural model. Using this method, Fujitsu has achieved an accuracy rate of 96.7%, surpassing the human equivalent recognition rate of 96.1% for handwritten Chinese characters. Fujitsu expects that this technology will enable further automation of computer input and recognition.
While humans can easily recognize media such as characters, images and sounds, this recognition is much more difficult for computers, both because the object to be recognized varies widely in shape, brightness and so on, and because similar objects exist. This has become a central problem in artificial intelligence research. Fujitsu has decades of experience in character recognition, with commercialized technologies used in such areas as Japan's finance and insurance fields for Japanese language, as well as a Chinese character recognition technology used by the Chinese government for 800 million handwritten census forms. Fujitsu started research using artificial intelligence based on deep learning for character recognition in 2010. In 2013, the character recognition technology built on this artificial intelligence took first place (recognition rate of 94.8%) in a handwritten Chinese character recognition contest held at a top-level international conference in the document image processing field, achieving the highest accuracy in the field.
With character recognition technology, the goal is to learn and store the features of the many character patterns thought to be used by humans when recognizing characters, using a model of connected hierarchies based on human neurons. When a character image is input, the first layer of the model perceives the simple features of the character, and then the next layer perceives the complex features of the character. In this way, the features effective for differentiating characters are extracted in an automatic and hierarchical fashion, and then the results of the learning process, including which features (neurons) the model reacted to, are accumulated. When attempting to recognize a character, the features of the input character are extracted in the same way as in the learning process, and the character is identified and recognition results output on the basis of which features (neurons) reacted as determined by the learning process. In order to further increase the accuracy of recognition, there was a need for a new effort to study the diversity of character deformations. This is because while Fujitsu had achieved the top level of accuracy in the field, it was not at a level comparable to human recognition activity (a recognition rate of 96.1%).
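The layered recognition described above can be illustrated with a toy example. This is a minimal sketch, not Fujitsu's model: the 3x3 images, the two character classes, and all weights are assumptions chosen by hand for illustration, whereas a real system learns its weights from many training samples. The first layer reacts to simple features (a fully inked row or column), and the second layer combines those reactions into a score for each character.

```python
# Toy two-layer hierarchical recognizer (illustrative assumption, hand-set weights).

def relu(values):
    """Keep positive responses; a neuron 'reacts' only when its input is positive."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Weighted sum of the inputs for each neuron, plus that neuron's bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Layer 1: six simple-feature neurons over a flattened 3x3 image.
# Neurons 0-2 sum one row each, neurons 3-5 sum one column each.
ROW_WEIGHTS = [[1 if i // 3 == r else 0 for i in range(9)] for r in range(3)]
COL_WEIGHTS = [[1 if i % 3 == c else 0 for i in range(9)] for c in range(3)]
LAYER1_WEIGHTS = ROW_WEIGHTS + COL_WEIGHTS
LAYER1_BIASES = [-2.0] * 6  # fires only when all three pixels of the line are inked

# Layer 2: combines simple features into one score per character class.
LABELS = ["T", "L"]
LAYER2_WEIGHTS = [
    [1, 0, 0, 0, 1, 0],  # "T": top row plus middle column
    [0, 0, 1, 1, 0, 0],  # "L": bottom row plus left column
]
LAYER2_BIASES = [0.0, 0.0]

def recognize(image):
    """Return the label whose second-layer neuron reacts most strongly."""
    pixels = [p for row in image for p in row]
    simple = relu(dense(pixels, LAYER1_WEIGHTS, LAYER1_BIASES))
    scores = dense(simple, LAYER2_WEIGHTS, LAYER2_BIASES)
    return LABELS[scores.index(max(scores))]

T_IMAGE = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 0]]
L_IMAGE = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]

print(recognize(T_IMAGE), recognize(L_IMAGE))
```

Recognition follows the same path as feature extraction: the input passes through both layers, and the output is decided by which neurons reacted.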
Now, Fujitsu has increased the number of connections between the neurons in the hierarchical model by over fifty times and has developed a technology to automatically produce many varieties of deformed character patterns for learning. Using this method, the model is able to learn in finer detail and achieves a recognition rate of 96.7% in recognizing handwritten Chinese characters, surpassing the human equivalent rate of 96.1%. The features of this newly developed technology are listed below.
1. Expanding the scale of the hierarchical model
Fujitsu has expanded the scale of the connections between neurons in the hierarchical model used in the character recognition process, raising recognition accuracy by increasing the number of connections from 2.8 million used in the previous technology (recognition rate 94.8%) to 150 million, in order to fine-tune the study of deformations (Figure 1, Figure 2).
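The size of this expansion can be checked with simple arithmetic. The sketch below computes the growth factor from the two connection counts given above. It also shows how connections add up between fully connected layers; that helper and its layer sizes are assumptions for illustration only, since the layer structure of Fujitsu's model is not described here.

```python
# Connection counts stated in the text.
PREVIOUS_CONNECTIONS = 2_800_000
NEW_CONNECTIONS = 150_000_000

# Growth factor of the expanded model relative to the previous one.
growth = NEW_CONNECTIONS / PREVIOUS_CONNECTIONS

def count_dense_connections(layer_sizes):
    """Count connections between consecutive fully connected layers.

    Hypothetical helper: assumes every neuron in one layer connects to
    every neuron in the next, which may not match the actual model.
    """
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

print(f"growth factor: {growth:.1f}x")
# Illustrative layer sizes only; widening the middle layer adds connections.
print(count_dense_connections([1024, 512, 3800]))
print(count_dense_connections([1024, 1024, 3800]))
```

The growth factor comes out at roughly 53.6, consistent with the "over fifty times" figure.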
2. Generating diverse character samples based on three-dimensional random deformation
There are about 3,800 Chinese characters to be recognized, making it extremely difficult to collect real-world patterns of deformation for each character. Therefore, Fujitsu has developed a technology to randomly deform existing character samples to automatically create all sorts of character samples for learning. This made it possible to have the hierarchical model study a multitude of different types of deformed character patterns (Figure 3).
Because previous methods only randomized the character's position in two dimensions, they could not account for differences in the brightness of parts of the background or parts of the character (strokes), or for other localized differences. To address this, Fujitsu devised a character sample generation technology based on random deformations in three dimensions. By adding the grey value of each pixel as a Z-axis parameter to the existing X and Y axes of the character pattern image, Fujitsu was able to generate a variety of deformed patterns.
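One way to picture deformation in three dimensions is sketched below. It is an illustration under stated assumptions, not Fujitsu's algorithm, whose details are not given here. The function `deform_3d` and all of its parameters are hypothetical. It treats each pixel as a point (x, y, z) with z as the grey value (0.0 for background, 1.0 for full ink), applies a random shift and shear in the X-Y plane, and then applies a random gain and a left-to-right brightness ramp along the Z axis.

```python
import random

def deform_3d(image, rng, max_shift=1, max_shear=0.2, max_gain=0.2, max_ramp=0.1):
    """Return a randomly deformed copy of a grey-scale image (rows of values in 0..1).

    Hypothetical sketch: X-Y deformation is a shift plus horizontal shear;
    Z deformation is a contrast gain plus a brightness ramp across the width.
    """
    height, width = len(image), len(image[0])
    dx = rng.randint(-max_shift, max_shift)        # horizontal shift in pixels
    dy = rng.randint(-max_shift, max_shift)        # vertical shift in pixels
    shear = rng.uniform(-max_shear, max_shear)     # slant of the strokes
    gain = 1.0 + rng.uniform(-max_gain, max_gain)  # overall grey-value scaling
    ramp = rng.uniform(-max_ramp, max_ramp)        # brightness drift across the width
    center_y = (height - 1) / 2
    deformed = []
    for y in range(height):
        row = []
        for x in range(width):
            # X-Y plane: look up the source pixel (nearest neighbour).
            src_x = round(x + shear * (y - center_y)) + dx
            src_y = y + dy
            inside = 0 <= src_x < width and 0 <= src_y < height
            z = image[src_y][src_x] if inside else 0.0
            # Z axis: perturb the grey value, then clamp to the valid range.
            z = gain * z + ramp * (x / max(1, width - 1))
            row.append(min(1.0, max(0.0, z)))
        deformed.append(row)
    return deformed

# A small vertical stroke used as the base pattern.
SAMPLE = [[0.0, 1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0]]

# Generate several training variants from one base pattern.
rng = random.Random(42)
variants = [deform_3d(SAMPLE, rng) for _ in range(5)]
```

Seeding the generator keeps the generated samples reproducible, and setting every maximum to zero returns the base pattern unchanged, which is a convenient sanity check.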
With this newly developed technology, Fujitsu achieved a recognition rate of 96.7% for handwritten Chinese characters, surpassing the human equivalent rate of 96.1%. Fujitsu anticipates that this technology will further automate all sorts of computer input and recognition tasks.
Fujitsu is aiming for the practical application of this technology in fiscal 2015, while also further increasing the accuracy of character recognition technology and expanding its use to the recognition of media other than written characters, such as pictures and voice. In addition, Fujitsu is also studying the applications of this character recognition technology to many other languages, such as Japanese, alphabet-based languages, and numerals.