Stanford computer scientists find Internet security flaw
May 24, 2011 By Melissae Fellet
Postdoctoral researcher Elie Bursztein, left, and John Mitchell, a professor of computer science, with colleagues built a computer program that revealed a security flaw in commercial audio captchas used by major Internet companies.
(PhysOrg.com) -- Researchers at the Stanford Security Laboratory create a computer program to defeat audio captchas on website account registration forms, revealing a design flaw that leaves them vulnerable to automated attacks.
Stanford researchers have found an audible security weakness on the Internet.
If you've ever registered for online access to a website, it's likely you were required as part of the process to correctly read a group of distorted letters and numbers on the screen.
That's a simple test to prove you're a human, not a computer program with malicious intent.
Though computers are good at filling out forms, they struggle to decipher these wavy images crisscrossed with lines, known as captchas (short for Completely Automated Public Turing test to tell Computers and Humans Apart).
But there's a second type of captcha, and it may pose more of a security weakness. These audio captchas, designed to help the visually impaired, require users to accurately listen to a string of spoken letters and/or numbers disguised with background noise.
John Mitchell, a professor of computer science, postdoctoral researcher Elie Bursztein and colleagues built a computer program that could listen to and correctly decipher commercial audio captchas used by Digg, eBay, Microsoft, Yahoo and reCAPTCHA, a company that creates captchas.
The researchers presented their results during a symposium on security and privacy in Oakland, Calif.
The Stanford program, called Decaptcha, successfully decoded Microsoft's audio captcha about 50 percent of the time. It correctly broke only about 1 percent of reCAPTCHA's codes, the most difficult ones of those tested, but even this small success rate is considered trouble for websites such as YouTube and Facebook that get hundreds of millions of visitors each day.
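A rough back-of-the-envelope sketch shows why even a 1 percent success rate matters at scale. The function names and the attempt counts below are illustrative assumptions, not figures from the Stanford study; only the 1 percent and 50 percent rates come from the article, and the sketch assumes each attempt succeeds or fails independently.

```python
import math


def expected_solves(attempts: int, success_rate: float) -> float:
    """Average number of captchas broken over a given number of attempts."""
    return attempts * success_rate


def attempts_needed(success_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of independent attempts n such that the chance of
    at least one success, 1 - (1 - p)^n, reaches the requested confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - success_rate))


if __name__ == "__main__":
    # Hypothetical volume: one million automated registration attempts.
    print(expected_solves(1_000_000, 0.01))
    # Attempts a bot needs for a 95 percent chance of one success.
    print(attempts_needed(0.01))   # reCAPTCHA-like rate from the article
    print(attempts_needed(0.50))   # Microsoft-like rate from the article
```

Under these assumptions, a bot facing a 1 percent captcha needs roughly 300 tries to be almost sure of getting through once, which is cheap for an automated attacker.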
Imagine a large network of malicious computers creating many fake accounts on YouTube. This robot network of accounts could highly rate the same video, falsely increasing its popularity and thereby its advertising revenue. "Bot" networks could also swamp email accounts with spam messages.
Decoding sounds
Computers have a tough time attempting to read image captchas, but Mitchell and Bursztein wondered if audio captchas were safe from automated attacks, too.
The researchers taught their program to recognize the unique sound patterns for every letter of the alphabet, as well as for the digits. Then they challenged their software to decode audio captchas it had never heard before.
The program worked by identifying the sound shapes in the target captcha file and comparing them to those stored in its memory. It worked because the software could, to some extent, imitate human hearing.
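The idea of comparing a sound shape to stored examples can be illustrated with a nearest-template match. This is only a minimal sketch: the article does not say which classifier or audio features Decaptcha uses, so the three-number feature vectors, the `templates` dictionary and the function names here are all hypothetical.

```python
import math


def classify(segment, templates):
    """Return the label whose stored template is closest to the segment.

    `segment` and every template are equal-length feature vectors; the
    comparison uses plain Euclidean distance (an assumption of this sketch).
    """
    return min(templates, key=lambda label: math.dist(segment, templates[label]))


def decode(segments, templates):
    """Classify each isolated segment in order and join the labels."""
    return "".join(classify(segment, templates) for segment in segments)


if __name__ == "__main__":
    # Made-up "sound shapes" learned during training.
    templates = {
        "a": [1.0, 0.0, 0.0],
        "b": [0.0, 1.0, 0.0],
        "7": [0.0, 0.0, 1.0],
    }
    # Two noisy segments from a captcha the program has not heard before.
    unseen = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.8]]
    print(decode(unseen, templates))
```

A real system would extract far richer features from the audio, but the principle of matching each segment against what was learned in training is the same.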
"In the battle of humans versus computers, we lost round one for audio captchas," Bursztein said. "But we have a good idea of what round two should be."
Designing captchas is challenging. The tests must be simple enough for users to answer quickly, yet complicated enough so computers struggle to decipher the patterns. Background noise in an audio captcha can confuse computers, but little is known about the types of noises that trip them up the most.
The researchers generated 4 million audio captchas mixed with white noise, echoes or music, and challenged the program to decode them. After training Decaptcha with some samples, they took it for a test drive.
The program easily defeated captchas mixed with static or repetition, with a 60 to 80 percent success rate, but background music made the task more difficult.
Decaptcha removes the background noise from each audio file, leaving distinctively shaped spikes of energy for each digit or letter in the captcha. The program clearly isolates these spikes from white noise or echoes. But when the captcha contains noises that mimic these energy spikes, Decaptcha is often confused.
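The spike-isolation step described above can be sketched as a simple energy threshold over short frames of audio. This is an assumption-laden illustration, not Decaptcha's actual method: the frame size, the threshold and the function names are hypothetical, and the samples are synthetic.

```python
def frame_energy(samples, frame_size):
    """Sum of squared amplitudes for each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def find_spikes(energies, threshold):
    """Return (start, end) frame ranges, end exclusive, above the threshold."""
    spikes = []
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i                      # a spike begins
        elif energy <= threshold and start is not None:
            spikes.append((start, i))      # the spike ends
            start = None
    if start is not None:                  # spike runs to the end of the audio
        spikes.append((start, len(energies)))
    return spikes


if __name__ == "__main__":
    # Synthetic signal: quiet noise (0.1) with two loud bursts (1.0).
    samples = [0.1] * 4 + [1.0] * 4 + [0.1] * 4 + [1.0] * 8 + [0.1] * 4
    energies = frame_energy(samples, frame_size=4)
    print(find_spikes(energies, threshold=1.0))
```

Steady, low-level noise stays under the threshold, so the bursts stand out. Background sounds that are as loud and as brief as spoken characters would cross the threshold too, which is consistent with the confusion the researchers describe.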
Building a program to solve captchas is "an interesting test case for machine learning technology," said Mitchell. "For audio, it's in a realm where machines should do better than humans."
Add meaning
And they do, until they have to think like us. Music lyrics or garbled voices are forms of semantic noise: sounds that carry meaning. Humans can recognize a message mixed with semantic noise, but computers can't distinguish the two clearly. Decaptcha correctly solved only about 1 percent of these captchas.
Of the commercial captchas the team tested, reCAPTCHA was the strongest because it contains background conversation and other semantic noise. Microsoft and Digg have recently changed their audio captchas to use this technology, Bursztein said. But the creation of this latest captcha cracker shows that even the best approach isn't secure enough. "The replacement technology isn't there yet, but we've pinpointed the problem," he said.
Citing data obtained from eBay, the researchers say about 1 percent of people who register at the site use audio captchas. That's enough users to warrant an effort to strengthen this security device.
The researchers suggest programmers tap into our human ability to understand meaning in sounds to improve future captchas. More secure puzzles could include background music or entire words instead of a string of letters. But the team cautions that programmers need to keep the human user in mind. If the captcha is too complicated, legitimate users won't be able to decode it.
Despite efforts to strengthen audio captchas against computer attacks, they will, like visual captchas, still be vulnerable to crowdsourced attacks by a group of people manually solving captchas for low wages.
Captchas are vital to freedom on the Internet, the researchers say, as the value of many social media sites depends on the assumption that fellow users are humans.
"Captchas are a big inconvenience to people," Mitchell said. "The fact that they're so widely used is evidence of their necessity."
Provided by
Stanford University
May 24, 2011
Rank: 5 / 5 (2)
"tihs cchapta is rdedalbe by hmunas" just needs to have the letters rearranged. How many ways can "rdedalbe" be rearranged?
It works because the first and last letters are correct - just typing 'redadalbe' into google gives you the result: "Did you mean, readable?"
What kind of sentence would you construct that humans could decipher but a computer could not?
May 24, 2011
Rank: not rated yet
A couple of ideas - perhaps they could use sound-bending techniques to randomly warp the sound of the letter as it is being pronounced. It is like having unlimited accents. Our brains can understand it, but a computer can't match it (yet).
Another idea is to use a simple question and answer type audio captcha. "What is one plus three?" or "Is a dog an animal or fruit?" Until IBM's Watson computer can be made into an affordable machine for the masses, this type of question/answer is easy for us, but beyond a computer.
May 24, 2011
Rank: not rated yet
Humans have not lost round 1 as Bursztein so arrogantly states. The 1% success rate at breaking the reCAPTCHA audio system clearly shows this. There is still plenty of room for improvement in captcha systems, and we can add a problem-solving test beside the image/audio captcha.