Norwegian researchers succeed in creating an artificial child's voice
It is very difficult to get a PC to recognise the voice of a child. Equally problematic is using a computer to synthesise speech in a child's voice. Norwegian researchers have found simple, effective solutions to both challenges.
"Synthesised speech has grown more and more similar to human speech. Yet children communicating via a speech device are still forced to use a synthetic adult voice," explains Magne Lunde, Managing Director of MediaLT, a company developing tools to assist disabled persons.
This drawback was the driver behind a collaborative research project involving MediaLT and Lingit, a software company. Together they are developing Norway's first synthesised childlike voice.
Using funding granted under the Research Council programme ICT for the disabled (IT Funk), they are putting an entirely new method to the test.
Converting the master voice into a comprehensible child's voice
"We start with what is known as a master voice, which is the product of three or four adult speakers recording several thousand phrases. Then we record a single child reading a smaller number of phrases aloud. We use this recording to modify the master voice, making it sound like a child's voice," relates Torbjørn Nordgård from Lingit. Dr Nordgård is also a professor of linguistics at the University of Nordland.
The phrases recorded by the child have been selected to include a number of the most essential sounds found in Norwegian.
"The master voice still carries the intonation, i.e. a phrase's melody. The result sounds rather like a child with unusual elocution skills, but it's still much better than the voice of an adult," says Dr Nordgård.
Far ahead of the rest of the world
Very little research has been carried out on this subject internationally. MediaLT and Lingit's innovative method of synthesising a child's voice is bringing them to the forefront of their field.
Everything is now in place to start testing trial versions of the child's voice.
"We hope to have a beta version in place this summer," says Magne Lunde.
PCs need to understand children's speech
Mr Lunde and his colleagues are also researching voice control, such as the use of verbal commands to operate a PC.
To operate a computer by means of speech, the machine must successfully decipher what is being said. Interpreting the speech of individuals at both the younger and the older end of the age scale is especially challenging, since the distance from their vocal cords to their lips is shorter than that of the average adult.
"Teaching a speech recognition program to understand the pronunciation of the various sounds of a language requires a relatively large amount of recorded speech. Unfortunately, insufficient data exist today in terms of actual children's speech," states Professor Torbjørn Svendsen from the Norwegian University of Science and Technology.
Professor Svendsen and his research partners have come up with an elegant yet simple method of overcoming the challenges associated with speech recognition and children: they have synthesised children's voices and used the results to compile a collection of data.
A vast improvement in quality
The length of the vocal tract affects the frequency distribution of the speech energy. The researchers are using technology to alter the energy distribution of adult speech so that it more closely resembles that of a child.
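The link between tract length and frequency can be illustrated with a textbook simplification that the article itself does not mention: treating the vocal tract as a uniform tube closed at the vocal cords and open at the lips, whose resonances fall at odd multiples of c / (4L). The sketch below is only an illustration under that assumption. The tract lengths (17 cm for an adult, 12 cm for a child) and the speed of sound are illustrative values, and the function name is hypothetical.

```python
from typing import List

# Approximate speed of sound in warm, humid air, in cm/s (assumed value).
SPEED_OF_SOUND_CM_S = 35_000.0


def tube_resonances(tract_length_cm: float, count: int = 3) -> List[float]:
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    This is a simplified model of the vocal tract, not the researchers' method.
    """
    if tract_length_cm <= 0:
        raise ValueError("tract length must be positive")
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * tract_length_cm)
        for n in range(1, count + 1)
    ]


# Illustrative lengths only; real vocal tracts vary from person to person.
adult_formants = tube_resonances(17.0)
child_formants = tube_resonances(12.0)
```

Under these assumptions the first resonance sits at roughly 515 Hz for the adult tube and roughly 729 Hz for the child tube: every resonance of the shorter tube is higher by the same ratio, 17/12.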
"The converted adult speech resembles the way children speak in terms of sound as well. Thus, we could apply our conversion technique to a large database of adult speech and generate a functional database of artificial childlike voices. We then used this to train a separate speech recognition program for children," explains Professor Svendsen.
"This greatly improved the recognition fidelity of children's speech. The error rate was reduced by 50 to 70 per cent," he states.
Activities are being carried out in cooperation with the researchers in the Voice control in multimodal dialogue (SMUDI) project, which received funding from the Research Council's Large-scale programme Core Competence and Value Creation in ICT (VERDIKT) and the Ministry of Education and Research.
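The article does not say how the conversion is implemented. One simple way to picture moving speech energy towards higher frequencies is a linear frequency warp of a short-time magnitude spectrum, an idea familiar from vocal tract length normalisation. The sketch below is an assumption-laden illustration, not the researchers' technique: the function name, the warp factor of 1.5 and the toy 16-bin spectrum are all hypothetical.

```python
from typing import List


def warp_spectrum(magnitudes: List[float], alpha: float) -> List[float]:
    """Stretch a magnitude spectrum along the frequency axis by factor alpha.

    Output bin k takes the value the input had at position k / alpha, using
    linear interpolation, so alpha > 1 shifts energy towards higher bins.
    Positions beyond the last input bin are filled with 0.0.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = len(magnitudes)
    warped = []
    for k in range(n):
        src = k / alpha              # fractional position in the input
        if src > n - 1:
            warped.append(0.0)       # nothing to read past the last bin
            continue
        lo = int(src)
        hi = min(lo + 1, n - 1)
        frac = src - lo
        warped.append((1.0 - frac) * magnitudes[lo] + frac * magnitudes[hi])
    return warped


# Toy "adult" frame: a single spectral peak at bin 4 of 16 (hypothetical data).
adult_frame = [0.0] * 16
adult_frame[4] = 1.0

# Hypothetical warp factor; a real system would tune this per speaker.
childlike_frame = warp_spectrum(adult_frame, alpha=1.5)
peak_bin = childlike_frame.index(max(childlike_frame))
```

In this toy case the peak moves from bin 4 to bin 6. A real system would apply a transformation of this kind frame by frame across a whole speech database before training the recogniser.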
Norwegian a tough language
The Norwegian language poses a number of especially steep challenges to speech recognition experts.
"In general, the degree of variation in any language is large enough to make it difficult to model. But Norwegian is especially tricky; there are two distinct written forms of the language, countless dialects and a wide range of accepted alternatives for words, declensions and compounds. On top of all this, there is no single pronunciation standard," stresses Torbjørn Svendsen.
Dr Svendsen also points out that people can experience considerable difficulty when faced with voice-controlled devices.
A video clip of two Scotsmen using a speech-operated lift illustrates this rather humorously.
"It is easy to get caught up in our fascination with speech recognition and the many possibilities it holds, so it's important not to replace existing technology when it remains the best option for getting something done, like using buttons to operate a lift," he concludes.
Provided by The Research Council of Norway