December 21, 2009 feature

Machine Translates Thoughts into Speech in Real Time

By Lisa Zyga, Medical Xpress

(PhysOrg.com) -- By implanting an electrode into the brain of a person with locked-in syndrome, scientists have demonstrated how to wirelessly transmit neural signals to a speech synthesizer. The "thought-to-speech" process takes about 50 milliseconds, the same amount of time it takes a non-paralyzed, neurologically intact person to speak their thoughts. The study marks the first successful demonstration of a permanently installed, wireless implant for real-time control of an external device.

The study is led by Frank Guenther of the Department of Cognitive and Neural Systems and the Sargent College of Health and Rehabilitation Sciences at Boston University, as well as the Division of Health Science and Technology at Harvard University-Massachusetts Institute of Technology. The research team includes collaborators from Neural Signals, Inc., in Duluth, Georgia; StatsANC LLC in Buenos Aires, Argentina; the Georgia Tech Research Institute in Marietta, Georgia; the Gwinnett Medical Center in Lawrenceville, Georgia; and Emory University Hospital in Atlanta, Georgia. The team published their results in a recent issue of PLoS ONE.

“The results of our study show that a brain-machine interface (BMI) user can control sound output directly, rather than having to use a (relatively slow) typing process,” Guenther told PhysOrg.com.

In their study, the researchers tested the technology on a 26-year-old male who had a brain stem stroke at age 16. The brain stem stroke caused a lesion between the volunteer’s motor neurons that carry out actions and the rest of the brain; while his consciousness and cognitive abilities are intact, he is paralyzed except for slow vertical movement of the eyes. The rare condition is called locked-in syndrome.

Five years ago, when the volunteer was 21 years old, the scientists implanted an electrode near the boundary between the speech-related premotor and primary motor cortex (specifically, the left ventral premotor cortex). Neurites began growing into the electrode and, within three or four months, produced signaling patterns on the electrode wires that have persisted ever since.

Three years after implantation, the researchers began testing the brain-machine interface for real-time synthetic speech production. The system is “telemetric”: it requires no wires or connectors passing through the skin, eliminating the risk of infection. Instead, the electrode amplifies the neural signals and converts them into frequency-modulated (FM) radio signals. These signals are wirelessly transmitted across the scalp to two coils, which are attached to the volunteer’s head using a water-soluble paste. The coils act as receiving antennas for the RF signals. The implanted electrode is powered by an induction power supply via a power coil, which is also attached to the head.

The signals are then routed to an electrophysiological recording system that digitizes and sorts them. The sorted spikes, which contain the relevant data, are sent to a neural decoder that runs on a desktop computer. The neural decoder’s output becomes the input to a speech synthesizer, also running on the computer. Finally, the speech synthesizer generates synthetic speech (in the current study, only three vowel sounds were tested). The entire process takes an average of 50 milliseconds.
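The decoding stage described above can be sketched in a few lines of Python. This is only a minimal illustration, not the decoder used in the study: the article does not describe the decoding algorithm, so a plain linear map stands in for it here, and the function names, window length, weights and bias values are all hypothetical.

```python
def firing_rates(spike_times_per_unit, window_start, window_len):
    """Convert sorted spike times (seconds) into one firing rate (Hz) per unit.

    Counts the spikes of each unit that fall inside the half-open window
    [window_start, window_start + window_len).
    """
    window_end = window_start + window_len
    rates = []
    for times in spike_times_per_unit:
        count = sum(1 for t in times if window_start <= t < window_end)
        rates.append(count / window_len)
    return rates


def decode_formants(rates, weights, bias):
    """Map firing rates to a (F1, F2) pair in Hz with a linear model.

    ASSUMPTION: a linear map is a stand-in for the study's neural decoder;
    `weights` (one row per formant) and `bias` are made-up parameters.
    """
    return tuple(
        b + sum(w * r for w, r in zip(row, rates))
        for row, b in zip(weights, bias)
    )


# Two hypothetical units observed over one 50 ms window.
spikes = [[0.01, 0.02, 0.20], [0.03]]
rates = firing_rates(spikes, window_start=0.0, window_len=0.05)

# Hypothetical decoder parameters; the output would feed a speech synthesizer.
weights = [[5.0, 0.0], [0.0, 10.0]]
bias = [300.0, 1000.0]
f1, f2 = decode_formants(rates, weights, bias)
```

In a real system the decoded formant pair would be recomputed for every new window and passed straight to the synthesizer, which is what keeps the loop fast enough for real-time feedback.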

As the scientists explained, there are no previous electrophysiological studies of neuronal firing in speech motor areas. In order to develop an accurate neural coding scheme, they had to rely on an established neurocomputational model of speech motor control. According to this model, neurons in the left ventral premotor cortex represent intended speech sounds in terms of “formant frequency trajectories.”

In an intact brain, these frequency trajectories are sent to the primary motor cortex where they are transformed into motor commands to the speech articulators. However, in the current study, the researchers had to interpret these frequency trajectories in order to translate them into speech. To do this, the scientists developed a two-dimensional formant frequency space, in which different vowel sounds can be plotted based on two formant frequencies (whose values are represented on the x and y axes).
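To make the idea of a two-dimensional formant frequency space concrete, the sketch below plots vowels as points on (F1, F2) axes and labels a decoded point by its nearest vowel. The three vowels and their formant values are approximate textbook figures for an adult male speaker, chosen purely for illustration; the article does not say which vowels or target values the study used, and the nearest-neighbor rule is an assumption rather than the researchers' method.

```python
import math

# ASSUMPTION: approximate textbook formant values in Hz (F1, F2),
# not the targets used in the study.
VOWEL_TARGETS = {
    "i": (270.0, 2290.0),  # as in "heed"
    "a": (730.0, 1090.0),  # as in "hot"
    "u": (300.0, 870.0),   # as in "who'd"
}


def nearest_vowel(f1, f2, targets=VOWEL_TARGETS):
    """Return the vowel whose (F1, F2) target is closest to the given point.

    Distance is plain Euclidean distance in Hz on the two formant axes.
    """
    return min(
        targets,
        key=lambda v: math.hypot(f1 - targets[v][0], f2 - targets[v][1]),
    )


# A decoded point near the high-F2 corner of the space is labeled "i".
label = nearest_vowel(280.0, 2200.0)
```

Because every point in the plane maps to some sound, a trajectory through this space corresponds to a continuously changing vowel, which is why decoding two numbers per time step can be enough to drive a vowel synthesizer.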

“The study supported our hypothesis (based on the DIVA model, our neural network model of speech) that the premotor cortex represents intended speech as an ‘auditory trajectory,’ that is, as a set of key frequencies (formant frequencies) that vary with time in the acoustic signal we hear as speech,” Guenther said. “In other words, we could predict the intended sound directly from neural activity in the premotor cortex, rather than try to predict the positions of all the speech articulators individually and then try to reconstruct the intended sound (a much more difficult problem given the small number of neurons from which we recorded). This result provides our first insight into how neurons in the brain represent speech, something that has not been investigated before since there is no animal model for speech.”

To confirm that the neurons in the implanted area were able to carry speech information in the form of formant frequency trajectories, the researchers asked the volunteer to attempt to speak in synchrony with a vowel sequence that was presented auditorily. In later experiments, the volunteer received real-time auditory feedback from the speech synthesizer. During 25 sessions over a five-month period, the volunteer significantly improved his thought-to-speech accuracy. His average hit rate increased from 45% to 70% across sessions, reaching a high of 89% in the last session.

Although the current study focused only on producing a small set of vowels, the researchers think that consonant sounds could be achieved with improvements to the system. While this study used a single three-wire electrode, the use of additional electrodes at multiple recording sites, as well as improved decoding techniques, could lead to rapid, accurate control of a speech synthesizer that could generate a wide range of sounds.

“Our immediate plans involve the implementation of a new synthesizer that can produce consonants as well as vowels but remains simple enough for a BMI user to control,” Guenther said. “We are also working on hardware that will greatly increase the number of neurons that are recorded. We expect to tap into at least 10 times as many neurons in the next implant recipient, which should lead to a dramatic improvement in performance.”

Overall, the work marks a milestone in the development of a permanent neural prosthesis that requires no major external hardware beyond a wireless receiver and laptop computer. Previous brain-machine interfaces for communication applications are very slow, producing only about one word per minute. The new system has the potential to enable real-time conversation, and help minimize the social isolation that accompanies profound paralysis.

More information: Guenther FH, Brumberg JS, Wright EJ, Nieto-Castanon A, Tourville JA, et al. (2009) A Wireless Brain-Machine Interface for Real-Time Speech Synthesis. PLoS ONE 4(12): e8218. doi:10.1371/journal.pone.0008218

Citation: Machine Translates Thoughts into Speech in Real Time (2009, December 21) retrieved 10 May 2025 from https://medicalxpress.com/news/2009-12-machine-thoughts-speech-real.html

