Software improves captioning for those with hearing deficits

October 18, 2017 by Laurel Thomas, University of Michigan

Making sure deaf and hard-of-hearing students get the information presented in class and at current academic events requires a lot of advance planning by the students and the offices that serve them.

It's also costly, at $150 an hour or more, and even the best captionists in the business make errors. Computerized programs, while able to convert speech in under five seconds, have unusually high error rates.

But software developed by a University of Michigan researcher makes real-time captions available on demand by engaging multiple non-expert captionists at the same time.

An article in the current issue of Communications of the ACM (Association for Computing Machinery) reports the success of Scribe, a program that takes content from several less-skilled translators and intelligently forms captions in less than four seconds.

"What we did is tried to essentially democratize this process," said Walter Lasecki, U-M assistant professor of information and of computer science and engineering, who began the work at the University of Rochester. "The trick is to algorithmically combine the efforts of a lot of people."

Currently, a student requiring help must notify an office dedicated to serving his or her needs well in advance to request assistance in a class or at an event. The office then hires a translator at an hourly rate plus travel. These captionists typically are hired for hours at a time.

Oftentimes, the translator is not someone with subject-matter expertise, which Lasecki said could be problematic in a senior-level mechanical engineering course or class with similar advanced content.

Asking several peers or hiring a half-dozen work study students is not only less expensive and easier to manage, especially for events with little notice, but the translation ends up being more accurate, Lasecki said.

On average, people can type only about 10 to 20 percent of what is being said. But when you combine the notes of many people, the picture becomes more complete.

"If we're both typing the same thing, I might miss a word but you might get that word," Lasecki said.

By having numerous note takers, even an incorrect interpretation of the material can usually be sorted out because it's likely more than one person has the same take on it.

"By doing turn-taking and then aggregation, we can actually get a much more reliable signal," he said.
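The idea of combining partial input can be illustrated with a minimal sketch. This is not Scribe's actual algorithm, which the article does not describe in detail. It assumes, purely for illustration, that each typed word has already been matched to its position in the spoken sentence (in practice, that alignment is the hard part). The function name merge_partial_captions and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def merge_partial_captions(worker_captions):
    """Combine partial captions by majority vote at each word position.

    worker_captions: list of dicts mapping a word position (int) to the
    word that one typist entered there. Positions a typist missed are
    simply absent from that typist's dict.
    """
    votes = defaultdict(Counter)
    for caption in worker_captions:
        for position, word in caption.items():
            votes[position][word.lower()] += 1

    # For each position that anyone covered, keep the most common word.
    return " ".join(
        votes[position].most_common(1)[0][0] for position in sorted(votes)
    )


# Hypothetical input: three typists each catch only part of the sentence,
# and one of them mishears a word.
workers = [
    {0: "the", 1: "trick", 4: "combine"},
    {1: "trick", 2: "is", 3: "to", 4: "combine"},
    {0: "the", 2: "is", 4: "confine", 5: "efforts"},
]

merged = merge_partial_captions(workers)
print(merged)
```

No single typist captured the whole sentence, but together they cover every position, and the one misheard word is outvoted by the two typists who agree.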

Lasecki said there is still room for improvement. For one, punctuation is challenging. But he hopes one day the program can be helpful to students and university offices that assist them.


More information: Walter S. Lasecki et al. Scribe, Communications of the ACM (2017). DOI: 10.1145/3068663
