Google launches next phase of voice-recognition project

December 14, 2010 By Mike Swift

Google on Tuesday switched on a new program that will dramatically improve the accuracy of its speech recognition service, which allows people to use verbal commands to search the Internet, send an e-mail or post a Facebook update.

That's of growing importance to the Mountain View, Calif., search giant, which sees Internet searches on smart phones as a significant part of its business. While the company doesn't disclose specific numbers, one in four searches on Android devices are now done by voice, and the search volume on Android phones climbed by 50 percent in the first six months of 2010.

"A lot of the world's information is spoken, and if Google's mission is to organize the world's information, it needs to include the world's spoken information," said Mike Cohen, who heads the company's speech efforts.

Users of the latest Android-powered smart phones can now allow Google to recognize the unique pattern of their speech by downloading a new app from the Android Market. The service gradually learns the patterns of a person's speech and eventually will more accurately understand their voice commands.

Google's ambitions don't stop at improving voice recognition. Its recent purchase of Phonetic Arts, a British company that specializes in speech output, highlights Google's plans to allow your computer or smart phone to speak back to you, in a "voice" that will sound increasingly natural, and even human.

Google earns the vast majority of its revenue through advertising, and expects a majority of its Internet business to flow through smart phones and other wireless devices in the future, so high-quality voice services are of critical importance.

The linguistic models that Cohen's team has helped develop over the past six years at Google, based on more than 230 billion searches typed into google.com and speech inflections recorded from millions of people who used voice search, are now so vast and complex that it would literally take several centuries for a single PC to create Google's digital model of spoken English.

For Google, said Al Hilwa, an analyst with the research firm IDC, "voice is a critical strategic competence."

Google's Dec. 3 announcement of the Phonetic Arts acquisition - terms were not disclosed - is "complementary to what Google is doing in social networking, video and mobile where it should be possible for people on the go to talk to their mobile devices, search engines or social networks as an alternative mechanism of interaction," Hilwa said.

Speech also is another key area where Google competes with Microsoft, which purchased Mountain View-based Tellme Networks, Inc., in 2007 to bulk up its speech services, and which also offers voice search through its Bing search engine.

Before joining Google, Cohen co-founded Nuance Communications, a Menlo Park, Calif., speech technology company, in 1994. Cohen, a part-time jazz guitarist whose first memory was the sound of the toy piano he received for Hanukkah in the Brooklyn apartment where he grew up, has been a research scientist in computer speech technology for more than a quarter century.

A bespectacled man with red hair who once worked as a piano tuner and whose sextet once played the Montreux Jazz Festival in Switzerland, Cohen likes to laugh at the irony of placing a native Brooklynite - given the New York borough's famously stretched and tangled dialect - in charge of speech recognition.

"I'm from Brooklyn," Cohen said. "I've never parked my car; I only 'paahhk' my car."

A person's accent, Cohen said, is one of the most difficult challenges for speech recognition services, and is one problem that the new personal service should help overcome.

Still, human speech involves much more than accents. A wide range of variables - including the shape of a person's mouth, teeth and throat, and the cadence and pitch of their sentences - are elements the human ear has evolved to differentiate, but which computers have not.

"It's all different from one person to another, and that all affects the sounds that come out," Cohen said. "There is tremendous variation between individuals. It's been a known thing that you can do better (at speech recognition) if you can do something to try to adapt to an individual's speech patterns."

More information: http://googlemobil … ersonal.html

(c) 2010, San Jose Mercury News (San Jose, Calif.).
Distributed by McClatchy-Tribune Information Services.


I_Dont_Have_A_Name
Dec 14, 2010

Rank: 3.6 / 5 (7)
Look, if it's anything like the Windows one, I'm not going to take it seriously. They started out with "dear mom" and it wrote "Dear aunt." That was 2007... maybe Google can pull it off better than Microsoft.
CreepyD
Dec 14, 2010

Rank: 5 / 5 (2)
Do we have a way of dialing a number yet without a single touch of the phone? Will this advance allow that possibly?
In theory it's a simple task of assigning your own voice to numbers, but you can't activate speech recognition without touching the phone! No good while driving when it's needed most.
stealthc
Dec 14, 2010

Rank: 1.8 / 5 (9)
Coming soon, the next phase: Google will leave its voice monitoring software on in order to spy on you.
winthrom
Dec 14, 2010

Rank: 4.7 / 5 (6)
HAL: "I'm sorry, Dave, I can't do that."
Wasabi
Dec 14, 2010

Rank: 2.3 / 5 (3)
Do we have a way of dialing a number yet without a single touch of the phone? Will this advance allow that possibly?
In theory it's a simple task of assigning your own voice to numbers, but you can't activate speech recognition without touching the phone! No good while driving when it's needed most.


Probably the easiest implementation of that would be key word or key phrase activation. A background process monitors voice audio input and triggers when the key word or phrase is spoken.
KomMaelstrom
Dec 15, 2010

Rank: 5 / 5 (1)
Coming soon, the next phase: Google will leave its voice monitoring software on in order to spy on you.

There isn't any viable reason they would do that, unless you've attracted attention to yourself and can be distinguished from nearly a billion other people. Sounds like a lot of wasted hard drive space to me.
LivaN
Dec 15, 2010

Rank: not rated yet
Coming soon, the next phase: Google will leave its voice monitoring software on in order to spy on you.

There isn't any viable reason they would do that, unless you've attracted attention to yourself and can be distinguished from nearly a billion other people. Sounds like a lot of wasted hard drive space to me.


Are you serious? Do you not know why we have so many privacy laws regarding personal data? Would you not mind if Facebook sold your user data to third-party organisations? But of course, it doesn't matter because unless you're unique, it's a waste of space.

Don't you even know Google's goal?
Webewitch
Dec 15, 2010

Rank: 5 / 5 (1)
Here's a funny video showing how voice recognition copes - or doesn't - with a Scottish accent:
http://www.youtub...cHhA7M-Y
JimB135
Dec 15, 2010

Rank: not rated yet
Look, if it's anything like the Windows one, I'm not going to take it seriously. They started out with "dear mom" and it wrote "Dear aunt." That was 2007... maybe Google can pull it off better than Microsoft.


I use Google voice search on my iPhone and it's scary accurate. Voice commands on my old BlackBerry were a novelty and pretty much useless. Google is way beyond novelty. I can imagine a world in the not too distant future in which we will be conversing with our electronic devices as if they were another person.

Just think -- between this and Facebook we will never have to talk face to face with an actual human being again. Billions of social misfits on the planet.
LuckyBrandon
Dec 19, 2010

Rank: not rated yet
Do we have a way of dialing a number yet without a single touch of the phone? Will this advance allow that possibly?
In theory it's a simple task of assigning your own voice to numbers, but you can't activate speech recognition without touching the phone! No good while driving when it's needed most.

There are already apps for that on the market that run even on old Windows Mobile 5 (I had it on Windows Mobile 5, 6, and 6.5). I believe Windows Phone 7 is still awaiting an app, but it'll come along sooner rather than later.

Look, if it's anything like the Windows one, I'm not going to take it seriously. They started out with "dear mom" and it wrote "Dear aunt." That was 2007... maybe Google can pull it off better than Microsoft.

Try 2001/2002... it's much better now...
frajo
Dec 19, 2010

Rank: not rated yet
Nice to see that a capable company follows in the footsteps of IBM's 1996 Warp4 which already had a voice recognition system built-in. One had to train it exactly the way described in the article:
The service gradually learns the patterns of a person's speech and eventually will more accurately understand their voice commands.
Quantum_Conundrum
Dec 19, 2010

Rank: not rated yet
Nice to see that a capable company follows in the footsteps of IBM's 1996 Warp4 which already had a voice recognition system built-in. One had to train it exactly the way described in the article:
The service gradually learns the patterns of a person's speech and eventually will more accurately understand their voice commands.


Yah, don't suppose there might be a lot of nuances that this one pathetically tiny paragraph doesn't accurately explain or describe, eh? Nah... couldn't be the case...

I've never even used this service. Didn't know it existed, because every time I go to Google I just automatically type in a search. I never even look at anything else on Google, except a few months ago the mapping technology, but uh, yeah... never used it, never needed it...