September 10, 2007 feature

Machines might talk with humans by putting themselves in our shoes

By Lisa Zyga, Phys.org

While robots can do some remarkable things, they don't yet possess the gift of gab. Since the 1970s, researchers have been trying to develop a speech-based human-machine interface, but improvements are gradual, and some fear that the performance of current systems may not reach an adequate level for real-world applications.

Roger Moore, a computer scientist at the University of Sheffield in the UK, thinks that the current bottom-up architecture of speech-based human-machine interactions may be flawed. He is concerned because, although the quantity of training data for machines has increased exponentially, machines are still poor at understanding accented or conversational speech, and lack individuality and expression when speaking.

Moore has recently suggested an alternative model for speech-based human-machine interaction called PRESENCE (PREdictive SENsorimotor Control and Emulation). While the conventional reductionist architecture views spoken language as a chain of transformations from the mind of the speaker to the mind of the listener, PRESENCE takes a more integrative approach. As Moore explains, PRESENCE focuses on a recursive feedback control structure, where the machine empathizes with the human by imagining itself in the human’s position, and then changes its speech patterns accordingly.

“The main difference between PRESENCE and current approaches to spoken language technology is that it offers the possibility of, one, unifying the processes of speech recognition and generation (thereby reducing the number of parameters that have to be estimated in setting up a system) and, two, linking low-level speech processing behaviors to high-level cognitive behaviors,” Moore told PhysOrg.com. “This should give a PRESENCE-based system a considerable advantage over more conventional systems that treat such processes as independent components, and then struggle to integrate them into a coherent overall system.”

Moore’s model is inspired by recent results in neurobiology—such as the communicative behavior of all living systems, and the special cognitive abilities of humans—that aren’t directly related to speech. Nevertheless, the results have provided a number of implications for human-machine speech, such as the strong relationship between sensor and motor activity, and the power of negative feedback control and memory to predict and anticipate future events.

“A key idea behind the PRESENCE architecture is that behavior is driven by underlying beliefs, desires and intentions,” Moore explained. “As a consequence, behavior is interpreted with respect to one organism’s understanding of another organism’s beliefs, desires and intentions. That is, the ‘meaning’ of an observed action is derived from the estimated beliefs, desires and intentions that lie behind it—an individual is only able to make sense of another’s actions because they themselves can perform those actions. This is precisely a manifestation of the empathetic or mirror relationships that can exist between conspecifics (members of the same species).”

In a preliminary investigation, Moore constructed a humanoid robot called “ALPHA REX” that uses the PRESENCE hierarchical structure to demonstrate the relatively simple task of human-machine synchronization. As a human spoke the words “one, two” at regular intervals, the robot generated taps. An overall control loop generated an error signal, which in turn modified the robot’s tapping rhythm until it matched the human’s words. Synchronization occurred by the eighth count, whereas a conventional model would require the robot to compute complex analytical solutions and would suffer system delays. Further, because ALPHA REX could anticipate the human’s behavior, it tapped one extra time after the human ceased counting.
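To make the idea of an error-driven control loop concrete, here is a minimal sketch in Python. It is not Moore's implementation and makes no claim about how ALPHA REX works internally: it uses a generic linear phase-and-period correction rule, and the function name `tap_along`, the gain values, and the starting conditions are all assumptions chosen for illustration. Each tap is compared with the beat it was aimed at, and the resulting error nudges both the timing of the next tap and the tapper's internal estimate of the beat period.

```python
def tap_along(beat_times, first_tap, initial_period,
              phase_gain=0.5, period_gain=0.2):
    """Hypothetical tapper that tracks a beat using negative feedback.

    Returns (taps, errors), where errors[i] = taps[i] - beat_times[i].
    The gains are illustrative assumptions, not values from the paper.
    """
    taps, errors = [], []
    tap, period = first_tap, initial_period
    for beat in beat_times:
        taps.append(tap)
        error = tap - beat              # positive means the tap came late
        errors.append(error)
        period -= period_gain * error   # correct the internal tempo estimate
        tap = tap + period - phase_gain * error  # schedule the next tap
    # The next tap is already predicted when the beats stop,
    # so the tapper emits one extra tap.
    taps.append(tap)
    return taps, errors


if __name__ == "__main__":
    beats = [float(i) for i in range(10)]   # one beat per second
    taps, errors = tap_along(beats, first_tap=0.3, initial_period=0.8)
    for beat, err in zip(beats, errors):
        print(f"beat {beat:4.1f}  asynchrony {err:+.3f}")
    print(f"extra predicted tap at {taps[-1]:.3f}")
```

With these assumed gains the asynchrony shrinks to about a hundredth of a second within ten beats, and the final entry in `taps` is the predicted tap that falls after the last beat, loosely mirroring the extra tap described above.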


While the task sounds simple, these kinds of coordination, reaction, and prediction abilities are necessary for the PRESENCE model, where behavior is quickly altered in response to the environment in order to achieve a desired state. As Moore explains, PRESENCE is less about speaking or listening than about the human and machine interacting to meet each other’s needs. Again, this is in sharp contrast to conventional models that rely on the breakdown of components such as speech recognition, generation and dialogue.

Future machines that use PRESENCE could support a variety of applications, such as robot companions or hands-free, eyes-free information retrieval. Moore predicts that PRESENCE machines could produce appropriate vocal intonations, volume levels, and a degree of emotion that is absent in current systems. He even suggests that the new machines could help unify currently divergent fields, such as speech science and technology and the natural, life and computer sciences, and could provide insight into the fields of neurobiology that inspired PRESENCE itself.

Finally, Moore explains that it is very difficult to predict the speed and degree of progress in the future of human-machine speech.

“If we simply continue with the current research paradigm (which is mainly training on more data),” Moore said, “then for automatic speech recognition to compete with alternative technologies (e.g. keyboards etc.), it would need to be half as good as human speech recognition (i.e. it doesn’t need to be ‘super-human’)—and that is five times better than it is today. And the time until this would happen? In about 20 years if progress of the past 10 years can be sustained, or, if it can’t (which is most likely), then [possibly] never!”

Citation: Moore, Roger K. “PRESENCE: A Human-Inspired Architecture for Speech-Based Human-Machine Interaction.” IEEE Transactions on Computers, Vol. 56, No. 9, September 2007.

Citation: Machines might talk with humans by putting themselves in our shoes (2007, September 10) retrieved 28 December 2024 from https://phys.org/news/2007-09-machines-humans.html

