Carnegie Mellon releases data on Haitian Creole to hasten development of translation tools

Jan 27, 2010

In response to the humanitarian crisis in Haiti, scientists at Carnegie Mellon University's Language Technologies Institute (LTI) have publicly released spoken and textual data they've compiled on Haitian Creole so that translation tools desperately needed by doctors, nurses and other relief workers on the earthquake-ravaged island can be rapidly developed.

Since Carnegie Mellon began to make the data publicly available last week, a team at Microsoft Research has used it to help develop an experimental, web-based system for translating between English and Haitian Creole (http://www.microsofttranslator.com/).

Translators Without Borders (http://www.tsf-twb.org/), a not-for-profit association based in Paris, plans to distribute a medical triage dictionary to doctors in Haiti once that data has been converted into a readable format. LTI researchers, likewise, have begun working on their own system for Haitian Creole.

Although French is the official language of Haiti and is spoken by elites, Haitian Creole is the most widely spoken language in Haiti, said Robert Frederking, LTI senior systems scientist. Haitian Creole is based on French, but has evolved substantially since Haitians overthrew the French colonists more than 200 years ago. Word meanings have drifted and the language incorporates some African syntax.

"French speakers can sort of puzzle through it, but Creole isn't penetrable if you don't know French," Frederking said. Few translation resources are available for the language, he added.

The Carnegie Mellon database for Haitian Creole was created in the late 1990s for Diplomat, a project sponsored by the Defense Advanced Research Projects Agency. The project was headed by Jaime Carbonell, LTI director, and focused on developing portable, speech-to-speech translation devices that could be deployed rapidly for Haitian Creole and other languages of special interest to the Department of Defense. Frederking and Alex Rudnicky, principal systems scientist in the Computer Science Department, served as co-principal investigators.

A prototype Haitian Creole translation system was delivered to the U.S. Army, but "as far as we know, nobody ever field-tested it," Frederking said. The project ended in the late 1990s, but LTI retained the data compiled and produced for the project.

After the Jan. 12 earthquake, LTI researchers decided to begin work on an updated translation system for Haitian Creole that would incorporate the latest translation technologies. To aid other groups pursuing parallel efforts worldwide, they also opted to release the data publicly at www.speech.cs.cmu.edu/haitian/, making it available with minimal restrictions. In addition to the Diplomat material, other data developed by researchers at LTI and elsewhere are being added to the site as they become available.

Given the extreme poverty of Haiti, "nobody is going to make money on a Haitian Creole translator," Frederking said. "But translation systems could be an important tool, both for the relief workers now involved in emergency response and in the long term as rebuilding takes place."
