June 15, 2010

Speech Synthesizer Helps Movie Critic

By Phillip F. Schewe, Inside Science News Service

The voices you hear on message services are often created artificially by fitting together short audio snippets from a large library of vocalized words and sounds. Scientists are now moving beyond the older generic voices to produce what, to many listeners, sounds like an "actual" person speaking.
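As a rough illustration of the "fitting together" idea described above, the following minimal Python sketch joins short snippets from a small library and blends each seam with a brief crossfade. It is not CereProc's method: the unit names, the sample rate, the overlap length, and the use of sine tones as stand-ins for recorded speech are all assumptions made only for this example.

```python
import math

SAMPLE_RATE = 16_000   # samples per second (assumed for illustration)
SNIPPET_LEN = 800      # samples per snippet, i.e. 50 ms at 16 kHz (assumed)


def tone(freq_hz, n_samples=SNIPPET_LEN):
    """Stand-in for a recorded snippet: a sine tone as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


# Hypothetical unit library. A real system would store recorded speech
# units here, not synthetic tones.
LIBRARY = {
    "HH": tone(200),
    "EH": tone(300),
    "L": tone(250),
    "OW": tone(350),
}


def concatenate(units, overlap=80):
    """Join snippets, linearly crossfading `overlap` samples at each seam."""
    out = list(LIBRARY[units[0]])
    for name in units[1:]:
        nxt = LIBRARY[name]
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)      # fade-in weight for next snippet
            j = len(out) - overlap + i       # matching position in the tail
            out[j] = (1 - w) * out[j] + w * nxt[i]
        out.extend(nxt[overlap:])            # append the non-overlapping rest
    return out


audio = concatenate(["HH", "EH", "L", "OW"])
print(len(audio), "samples,", len(audio) / SAMPLE_RATE, "seconds")
```

The crossfade is one simple way to soften the joins between snippets. Production systems also have to choose which of many candidate snippets to use at each position, which this sketch does not attempt.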

Voice synthesis technology had a recent success when a personalized text-to-speech (TTS) system was crafted for the movie critic Roger Ebert, who suffers from thyroid cancer and can no longer speak. The synthetic Ebert speech was created from many hours of his voice recorded during past television programs. The company that made Ebert’s system, CereProc, is able to make TTS conversions quickly, allowing it to provide more than just generic-voice synthesis.

Matthew Aylett, one of the founders of CereProc, said that he and his colleagues do not necessarily aim for smooth voicings. Instead, he said, "We want the variation which gives a voice a fresh and natural feel to it. This means not getting our voice talents to speak in a boring and neutral way but capturing their more conversational speech style."

At CereProc they are even able to build a certain amount of emotion into their voices, as well as creating regional accents. The company, located in Edinburgh, Scotland, has been successful, for example, in reproducing Scottish and Irish accented speech.

For "cloning" voices of specific people (they did one for President George Bush, for example), Aylett said they sometimes have to resort to "found" recordings. The trouble with using such snippets of voice is that the recordings were often made under a variety of acoustic conditions, which then have to be corrected in making a final voice.

Roger Ebert learned about the President Bush "voice" and asked if CereProc could assemble a voice for him using the large inventory of audio recordings of his television show. So far they have used about five hours of Ebert’s voice to produce a voice bank of about 300,000 phonetic sounds.

Timothy Bunnell, a scientist at Nemours Biomedical Research in Wilmington, Del., is working to make personalized voices accessible to everyone, especially for those with neurodegenerative diseases.

Bunnell's voice synthesis system can be prepared for people who still have the power of speech. By contrast, Aylett’s synthetic speech program, such as that for Roger Ebert, is based on recordings of people who can no longer speak.

Carrying out text-to-speech synthesis for children is more difficult. "It is difficult for young children to record enough speech with the required degree of consistency and precision needed to build a high-quality synthetic voice," said Bunnell. The main problem, he said, is not the nature of the utterances but the amount of variability in children's speech, which is much greater than in adults' speech.

Aylett enjoys synthesis research. "It's fun to use the synthesizer, but it’s even better to see it helping people who really need it."

Provided by Inside Science News Service

Citation: Speech Synthesizer Helps Movie Critic (2010, June 15) retrieved 20 September 2024 from https://phys.org/news/2010-06-speech-movie-critic.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
