Smart listeners and smooth talkers
Sounds recognised from an audio recording. Credit: Xunying Liu
Human-like performance in speech technology could be just around the corner, thanks to a new research project that links three UK universities.
Human conversation is rich and it's messy. When we communicate, we constantly adjust to those around us and to the environment we're in; we leave words out because the context provides meaning; we rush or hesitate, or change direction; we overlap with other speakers; and, crucially, we're expressive.
No wonder, then, that it's proved so challenging to build machines that interact with people naturally, with human-like performance and behaviour.
Nevertheless, there have been remarkable advances in speech-to-text technologies and speech synthesisers over recent decades. Current devices speed up the transcription of dictation, add automatic captions to video clips, enable automated ticket booking and improve the quality of life for those requiring assistive technology.
However, today's speech technology is limited by its lack of ability to acquire knowledge about people or situations, to adapt, to learn from mistakes, to generalise and to sound naturally expressive. "To make the technology more usable and natural, and open up a wide range of new applications, requires field-changing research," explained Professor Phil Woodland of Cambridge's Department of Engineering.
Along with scientists at the Universities of Edinburgh and Sheffield, Professor Woodland and colleagues Drs Mark Gales and Bill Byrne have begun a five-year, £6.2 million project funded by the Engineering and Physical Sciences Research Council to provide the foundations of a new generation of speech technology.
Complex pattern matching
Speech technology systems are based on powerful techniques that are capable of learning statistical models known as Hidden Markov Models (HMMs). Trained on large quantities of real speech data, HMMs model the relationship between the basic speech sounds of a language and how these are realised in audio waveforms.
It's a complex undertaking. For speech recognition, the system must work with a continuous stream of acoustic data, with few or no pauses between individual words. To determine where each word stops and starts, HMMs attempt to match the pattern of successive sounds (or phonemes) to the system's built-in dictionary, assigning a probability score as to which sounds are most likely to follow the first sound to complete a word. The system then takes into account the structure of the language and which word sequences are more likely than others.
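To make the pattern matching concrete, here is a minimal sketch of how the most likely sequence of phonemes can be recovered from a stream of acoustic observations. The article does not name a decoding algorithm; the Viterbi algorithm, a standard method for finding the best state sequence in an HMM, is used here as an assumption. The three phoneme states, the observation labels ("burst", "vowel", "closure") and every probability are invented for illustration, and a real recogniser works with continuous acoustic features and far larger models.

```python
import math

def log(p):
    # Log-probabilities avoid numerical underflow; zero maps to -infinity.
    return math.log(p) if p > 0 else float("-inf")

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log_prob, path) for the most likely hidden state sequence."""
    # best[s] holds the best (log probability, path) ending in state s.
    best = {
        s: (log(start_p[s]) + log(emit_p[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Choose the predecessor that gives the highest score on entering s.
            score, path = max(
                (best[prev][0] + log(trans_p[prev][s]), best[prev][1])
                for prev in states
            )
            new_best[s] = (score + log(emit_p[s][obs]), path + [s])
        best = new_best
    return max(best.values())

# Hypothetical left-to-right model for the word "cat": /k/ -> /ae/ -> /t/.
states = ["k", "ae", "t"]
start_p = {"k": 0.8, "ae": 0.1, "t": 0.1}
trans_p = {
    "k": {"k": 0.5, "ae": 0.5, "t": 0.0},
    "ae": {"k": 0.0, "ae": 0.6, "t": 0.4},
    "t": {"k": 0.0, "ae": 0.0, "t": 1.0},
}
emit_p = {
    "k": {"burst": 0.7, "vowel": 0.1, "closure": 0.2},
    "ae": {"burst": 0.1, "vowel": 0.8, "closure": 0.1},
    "t": {"burst": 0.3, "vowel": 0.1, "closure": 0.6},
}

# One invented label per short frame of audio.
frames = ["burst", "vowel", "vowel", "closure"]
log_p, path = viterbi(frames, states, start_p, trans_p, emit_p)
print(path)  # ['k', 'ae', 'ae', 't']
```

The repeated "ae" in the result shows how one phoneme can span several frames of audio, which is how such a model copes with sounds of varying length. In a full system, scores like `log_p` would be combined with a language model that favours likely word sequences.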
Adapt, train and talk
A key focus for the new project is to build systems that are adaptive, enabling them to acclimatise automatically to particular speakers and learn from their mistakes. Ultimately, the new systems will be able to make sense of challenging audio clips, efficiently detecting who spoke what, when and how.
Unsupervised training is also crucial, as Professor Woodland explained: "Systems are currently pre-trained with the sort of data they are trying to recognise (so a dictation system is trained with dictation data), but this is a significant commercial barrier as each new application requires specific types of data. Our approach is to build systems that are trained on a very wide range of data types and enable detailed system adaptation to the particular situation of interest. To access and structure the data, without needing manual transcripts, we are developing approaches that allow the system to train itself from a large quantity of unlabelled speech data."
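One common way for a system to "train itself" is self-training: label the unlabelled data with the current model, keep only the confident guesses, and retrain. The sketch below illustrates that general idea only; the article does not describe the project's actual method. The one-dimensional data, the nearest-centroid model, and the names `centroids`, `self_train` and `margin` are all hypothetical.

```python
def centroids(labelled):
    """Average the feature values for each label: a deliberately tiny 'model'."""
    sums, counts = {}, {}
    for x, y in labelled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def self_train(labelled, unlabelled, margin=2.0, rounds=5):
    """Grow the labelled set with confident guesses (needs at least two labels)."""
    labelled, pool = list(labelled), list(unlabelled)
    for _ in range(rounds):
        model = centroids(labelled)
        confident, rest = [], []
        for x in pool:
            # Rank labels by distance from x to each label's centroid.
            ranked = sorted(model, key=lambda y: abs(x - model[y]))
            best, runner_up = ranked[0], ranked[1]
            # Accept the guess only if it clearly beats the runner-up.
            if abs(x - model[runner_up]) - abs(x - model[best]) >= margin:
                confident.append((x, best))
            else:
                rest.append(x)
        if not confident:
            break  # nothing new was learned, so stop early
        labelled += confident
        pool = rest
    return centroids(labelled), pool

# Two hand-labelled seed examples and five unlabelled ones (all invented).
seed = [(0.0, "a"), (10.0, "b")]
model, leftover = self_train(seed, [1.0, 9.0, 3.0, 6.5, 5.0])
print(model, leftover)
```

The ambiguous value 5.0 never gets a label, which is the point of the confidence check: uncertain guesses are kept out of the training data rather than allowed to reinforce errors.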
"One very interesting aspect of the work is that the fundamental HMMs are also generators of speech, and so the adaptive technology underlying speech recognition is also being applied to the development of personalised speech synthesis systems," added Professor Woodland. New systems will take into account expressiveness and intention in speech, enabling devices to be built that respond to an individual's voice, vocabulary, accent and expressions.
The three university teams have already made considerable contributions to the field and many techniques used in current speech recognition systems were developed by the engineers involved in the new project. The new programme grant enables them to take a wider vision and to work with companies that are interested in how speech technology could transform our lives at home and at work. Applications already planned include a personalised voice-controlled device to help the elderly to interact with control systems in the home, and a portable device to enable users to create a searchable text version of any audio they encounter in their everyday lives.
Provided by
University of Cambridge