Augmented tongue ultrasound for speech therapy

October 13, 2017, CNRS
Example of tongue model animations of the GIPSA-Lab articulatory talking head from ultrasound images, using the Integrated Cascaded Gaussian Mixture Regression algorithm for [ata] (top) and [uku] (bottom) sequences. Thomas Hueber / GIPSA-Lab. Credit: CNRS/Université Grenoble Alpes / Grenoble INP

A team of researchers at the GIPSA-Lab (CNRS/Université Grenoble Alpes/Grenoble INP) and at INRIA Grenoble Rhône-Alpes has developed a system that can display the movements of the tongue in real time. Captured using an ultrasound probe placed under the jaw, these movements are processed by a machine-learning algorithm that controls an "articulatory talking head." As well as the face and lips, this avatar shows the tongue, palate and teeth, which are usually hidden inside the vocal tract. This "visual biofeedback" system, which ought to be easier to understand and should therefore lead to better correction of pronunciation, could be used for speech therapy and for learning foreign languages. This work is published in the October 2017 issue of Speech Communication.

For a person with an articulation disorder, speech therapy relies partly on repetition exercises: the practitioner qualitatively analyzes the patient's pronunciation and explains orally, with the help of drawings, how to place the articulators, particularly the tongue, something patients are generally unaware of. How effective the therapy is depends on how well the patient can integrate what they are told. It is at this stage that "visual biofeedback" systems can help. They let patients see their articulatory movements in real time, and in particular how their tongues move, so that they become aware of these movements and can correct pronunciation problems faster.

For several years, researchers have been using ultrasound to design biofeedback systems. The image of the tongue is obtained by placing under the jaw a probe similar to that used conventionally to look at a heart or fetus. This image is sometimes deemed difficult for a patient to use because it is of poor quality and provides no information on the location of the palate and teeth. In this new work, the team of researchers proposes to improve this visual feedback by automatically animating an articulatory talking head from ultrasound images. This virtual clone of a real speaker, in development for many years at the GIPSA-Lab, produces a contextualized, and therefore more natural, visualization of articulatory movements.

Credit: CNRS

The strength of this new system lies in a machine learning algorithm that researchers have been working on for several years. This algorithm can (within limits) process articulatory movements that users cannot achieve when they start to use the system. This property is indispensable for the targeted therapeutic applications. The algorithm exploits a probabilistic model based on a large articulatory database acquired from an "expert" speaker capable of pronouncing all of the sounds in one or more languages. This model is automatically adapted to the morphology of each new user, over the course of a short system calibration phase, during which the patient must pronounce a few phrases.
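The figure caption names the algorithm as Integrated Cascaded Gaussian Mixture Regression. As a rough illustration of the underlying idea only, the sketch below shows plain Gaussian mixture regression on a single input and a single output: each Gaussian component proposes a linear prediction, and the predictions are blended according to how well each component explains the input. It does not reproduce the cascaded or speaker-adaptive parts of the researchers' method. The names `Component`, `gmr_predict` and `MODEL`, the idea of one scalar "ultrasound feature" and one scalar "tongue parameter", and all numeric values are assumptions made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    """One Gaussian of a joint mixture over (x, y). All values are illustrative."""
    weight: float   # mixture weight
    mean_x: float   # mean of the input (hypothetical ultrasound feature)
    mean_y: float   # mean of the output (hypothetical tongue parameter)
    var_x: float    # variance of the input
    cov_xy: float   # covariance between input and output


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmr_predict(x: float, components: list) -> float:
    """Return E[y | x] under the joint Gaussian mixture."""
    # How well each component explains the observed input.
    likelihoods = [c.weight * gaussian_pdf(x, c.mean_x, c.var_x) for c in components]
    total = sum(likelihoods)
    if total == 0.0:
        raise ValueError("input is too far from every component")
    estimate = 0.0
    for c, lik in zip(components, likelihoods):
        responsibility = lik / total
        # Conditional mean of y given x for this single Gaussian.
        conditional_mean = c.mean_y + c.cov_xy / c.var_x * (x - c.mean_x)
        estimate += responsibility * conditional_mean
    return estimate


# Made-up two-component model, standing in for one learned from an expert speaker.
MODEL = [
    Component(weight=0.5, mean_x=-1.0, mean_y=0.0, var_x=0.25, cov_xy=0.125),
    Component(weight=0.5, mean_x=1.0, mean_y=2.0, var_x=0.25, cov_xy=-0.125),
]

if __name__ == "__main__":
    for feature in (-1.0, 0.0, 1.0):
        print(feature, round(gmr_predict(feature, MODEL), 3))
```

In a real system the inputs and outputs would be vectors and the mixture parameters would be estimated from recorded data rather than written by hand; the scalar case is used here only to keep the arithmetic visible.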

This system, validated in a laboratory on healthy speakers, is now being tested in a simplified version in a clinical trial for patients who have had tongue surgery. The researchers are also developing another version of the system, where the articulatory talking head is automatically animated, not by ultrasound, but directly by the user's voice.

More information: Thomas Hueber et al. Speaker-Adaptive Acoustic-Articulatory Inversion Using Cascaded Gaussian Mixture Regression, IEEE/ACM Transactions on Audio, Speech, and Language Processing (2015). DOI: 10.1109/TASLP.2015.2464702
