Interactive communication technology for human symbiotic robot to enhance natural communication with humans
Hitachi today announced the development of interactive communication technology for the human symbiotic robot EMIEW2. The technology selects the optimal answer and explanation based on the subject and attributes included in a question, then estimates the other party's level of comprehension from body movements, such as nodding or tilting the head to the side, in order to give a more natural response. This enables more flexible responses to questions and realizes smooth communication between humans and robots.
Hitachi has been developing human symbiotic robotics technology since it developed EMIEW in 2005. First announced in 2007, EMIEW2 has realized motor functions such as two-wheel autonomous locomotion at 6 km/h, about the pace of a fast-walking person, and predicting and avoiding collisions. It has also realized intelligent functions such as distinguishing speech from background noise using 14 microphones, identifying objects from information on the Internet, and guiding the enquirer to the object.
In the evolution of human symbiotic robots, free communication between humans and robots is the most important technology, and much research and development has been conducted in the area. Free communication requires voice recognition, content comprehension, response construction and voice synthesis technologies. In recent years, such technology has been implemented in devices such as mobile phones, where the subject is estimated from speech and a corresponding response is provided to the speaker. For robots, however, independent technology development was necessary, as conversation is conducted at a distance with no hands-on operation by the speaker. This time, two technologies that advance the conversational functions of robots were developed and mounted in EMIEW2. Details of the technologies are as follows.
(1) Select the optimal response from several words included in the enquiry
The words and word order needed to identify the subject and attributes are learnt from prepared questions and recorded in a database. Technology was developed to recognize the subject and attributes of a question by using voice recognition to identify the word order and comparing it with the database. With this technology, the optimal response can be selected for the subject and attributes of the question posed. Deep Learning, a machine learning method receiving much attention in the field of recognition, was used to enable a high level of recognition.
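As a rough illustration of the selection step only, the sketch below maps words in a recognized question to a subject and an attribute, then looks up a prepared response. It is not Hitachi's method: the keyword tables, the responses and the names `SUBJECT_WORDS`, `ATTRIBUTE_WORDS`, `RESPONSES` and `select_response` are all hypothetical, plain keyword lookup stands in for the learnt Deep Learning recognizer, and word order is ignored.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical keyword tables. In the system described above these
# associations are learnt from prepared questions, not written by hand.
SUBJECT_WORDS: Dict[str, str] = {
    "restroom": "restroom",
    "toilet": "restroom",
    "cafeteria": "cafeteria",
    "lunch": "cafeteria",
}
ATTRIBUTE_WORDS: Dict[str, str] = {
    "where": "location",
    "when": "hours",
    "open": "hours",
}

# Hypothetical prepared responses, keyed by (subject, attribute).
RESPONSES: Dict[Tuple[str, str], str] = {
    ("restroom", "location"): "The restroom is down the hall on the left.",
    ("cafeteria", "location"): "The cafeteria is on the second floor.",
    ("cafeteria", "hours"): "The cafeteria is open from 11:00 to 14:00.",
}


def identify(words: List[str], table: Dict[str, str]) -> Optional[str]:
    """Return the label of the first word found in the table, if any."""
    for word in words:
        if word in table:
            return table[word]
    return None


def select_response(utterance: str) -> Optional[str]:
    """Pick a prepared response from the subject and attribute of a question."""
    # Stand-in for voice recognition output: lowercase words, punctuation dropped.
    words = re.findall(r"[a-z']+", utterance.lower())
    subject = identify(words, SUBJECT_WORDS)
    attribute = identify(words, ATTRIBUTE_WORDS)
    if subject is None or attribute is None:
        return None  # no prepared response matches this question
    return RESPONSES.get((subject, attribute))


if __name__ == "__main__":
    print(select_response("Where is the cafeteria?"))
    print(select_response("When does the cafeteria open?"))
```

Keying the responses on the (subject, attribute) pair lets differently worded questions about the same thing reach the same answer, which is the effect the paragraph above describes.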
(2) Ascertain the enquirer's comprehension level from movement such as nodding and tilting of the head
Video images of EMIEW2 in dialogue with humans are pre-analyzed to study the body movements that accompany responses. In actual conversation, EMIEW2 captures the movement of the enquirer with an internal camera, and identifies movements such as nodding or tilting of the head to the side. Technology was developed to determine the enquirer's level of comprehension by comparing the enquirer's actual response with the response expected to EMIEW2's reply. Even more human-like conversation can be achieved by understanding the enquirer's level of comprehension in relation to the nature of the reply.
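The comparison of actual and expected responses might be sketched as follows. This is an assumption-laden illustration, not the implemented technology: the gesture labels, the three comprehension levels, the rule that a head tilt signals confusion, and the names `Gesture`, `Comprehension`, `estimate_comprehension` and `next_action` are all hypothetical, and gesture detection from camera images is taken as already done.

```python
from enum import Enum


class Gesture(Enum):
    """Hypothetical labels for movements detected by the camera."""
    NOD = "nod"
    HEAD_TILT = "head_tilt"
    NONE = "none"


class Comprehension(Enum):
    """Hypothetical comprehension levels."""
    UNDERSTOOD = "understood"
    CONFUSED = "confused"
    UNKNOWN = "unknown"


def estimate_comprehension(observed: Gesture,
                           expected: Gesture = Gesture.NOD) -> Comprehension:
    """Compare the observed gesture with the one expected after a reply."""
    if observed == expected:
        return Comprehension.UNDERSTOOD
    if observed is Gesture.HEAD_TILT:
        # Assumption: a head tilt where a nod was expected signals confusion.
        return Comprehension.CONFUSED
    return Comprehension.UNKNOWN


def next_action(level: Comprehension) -> str:
    """Choose a hypothetical follow-up for the robot's next turn."""
    if level is Comprehension.UNDERSTOOD:
        return "continue"
    if level is Comprehension.CONFUSED:
        return "rephrase"
    return "ask_if_clear"


if __name__ == "__main__":
    level = estimate_comprehension(Gesture.HEAD_TILT)
    print(level.value, next_action(level))
```

Separating the estimate from the follow-up action keeps the comparison rule independent of how the robot chooses to respond to each level.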
Employing these two technologies, EMIEW2 is able to respond with the optimal answer to freely posed questions by recognizing the subject and attributes, and further respond accordingly by watching body movement, facilitating smoother conversation.
Hitachi will continue to promote developments towards improving the practicality of human symbiotic service robots supporting humans, including interactive communication technology.