uComp research project delivers first results under open source license

Jun 28, 2013

Methods to extract knowledge from social media intelligently and automatically are currently being developed at MODUL University Vienna - and the latest advances have just been published in preparation for an international conference. These advances come in the form of an open source tool to collect and process publicly available social media information.

The tool supports text acquisition, language identification, detection of phonetic similarities, as well as the standardized integration and archiving of the captured information. The toolkit represents a major step forward in the uComp project of MODUL University Vienna (Austria) and its European partners. Using the domain of climate change as an example, the project combines cutting-edge methods to automatically capture information from complex sources and combine it with collective human intelligence in the tradition of the "wisdom of the crowds".

The internet is very different from a well-structured database. Unlike libraries or large corporate archives, online information is fragmented and disordered, which makes it difficult to extract knowledge automatically. The emergence of social media has further complicated the process. It is difficult to determine the specific context of a posting, and the use of slang, dialects or foreign words challenges existing tools for text analysis. Researchers are currently working on solving this problem in the uComp project, jointly conducted by MODUL University Vienna and partner organizations from Austria, England and France. After only six months, the first results have now been published in preparation for the 7th International Conference on Knowledge Capture (K-CAP 2013) in Banff, Canada.

Man/machine symbiosis

The objective of uComp is explained by the head of the Department of New Media Technology at MODUL University Vienna, Prof. Arno Scharl, using the domain of climate change as a use case: "Millions of people express their opinions in social media, but with conventional methods we are unable to determine the collective mood expressed in social media in real time. We do not know which aspects move people, mobilize people or stimulate their thoughts. The technologies from the uComp project provide us with better ways to capture opinions - on a global basis, irrespective of language barriers, national borders and cultural differences."

The key aspect of uComp for Prof. Scharl, who also serves as the project's Technical Director, is the combination of collective human intelligence and automated knowledge extraction by software tools. The first step toward achieving this vision has successfully been taken with the "extensible Web Retrieval Toolkit" (eWRT), which has now been published in a scientific paper. As an open source tool, eWRT promotes a transparent approach to analyzing data from social media platforms. The system captures data from many different public sources and accurately identifies the language of the gathered information items. Additional functions include the ability to archive large volumes of data, including the management and normalization of relevant metadata (= data that describes the structure and content of documents).
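To make the two steps named above more concrete, the following is a minimal sketch of language identification and metadata normalization. It is not eWRT's code or API: the function names, the tiny stopword lists, and the field names (`user`, `message`, `author`, `text`) are all assumptions chosen for illustration, and stopword counting is a deliberately simple stand-in for whatever method the toolkit actually uses.

```python
import re

# Tiny illustrative stopword lists (an assumption for this sketch);
# a production system would rely on much larger language models.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "fr": {"le", "la", "les", "et", "est", "des", "un", "une"},
}


def guess_language(text):
    """Return the language code whose stopwords occur most often, or None."""
    tokens = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No stopword matched at all: refuse to guess.
    return best if scores[best] > 0 else None


def normalize_metadata(raw):
    """Map source-specific field names onto one hypothetical common schema."""
    text = (raw.get("message") or raw.get("text") or "").strip()
    author = (raw.get("user") or raw.get("author") or "").strip()
    return {"author": author, "text": text, "language": guess_language(text)}
```

Calling `normalize_metadata({"user": " anna ", "message": "Le climat est une question"})` would yield a record with a trimmed author name and a guessed language code, regardless of which source-specific field names the posting arrived with.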

The next two-and-a-half years will focus on using collective human intelligence for the analysis and validation of data gathered with eWRT. Games with a purpose represent a promising approach in the field of human computation (HC). Examples include online games for classifying documents or for evaluating automatic translations. By aiming to integrate such games into a comprehensive framework to identify complex knowledge patterns, the uComp project is entering unknown digital territory. As Prof. Scharl explains, "We are currently investigating ways of engaging people and providing incentives for participants to share their knowledge. At the same time we need to evaluate the reliability of their contributions, prevent manipulation and assess the quality of results. The uComp project will advance the state of the art by offering all these capabilities in an integrated, reusable framework."
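One common way to assess the reliability of crowd contributions is to collect several independent answers per item and accept a label only when enough participants agree. The sketch below illustrates that general idea with simple majority voting; it is an assumption for illustration, not the method used in uComp, and the function name and the 0.6 threshold are hypothetical.

```python
from collections import Counter


def aggregate_votes(votes, min_agreement=0.6):
    """Combine several players' labels for one item by majority vote.

    Returns (label, agreement). The label is None when the share of
    players backing the most frequent answer is below min_agreement.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(votes)
    accepted = label if agreement >= min_agreement else None
    return accepted, agreement
```

Items that come back without an accepted label could be shown to further players, while the agreement score gives a rough quality measure for the accepted ones.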

The uComp project and the collaboration of Prof. Scharl's team with fellow researchers from England, France and Austria continue a successful tradition. The DIVINE project, funded by the Austrian Research Promotion Agency (FFG) and the Austrian Ministry for Transport, Innovation and Technology (BMVIT), has already addressed important aspects of the dynamic integration and visualization of information spaces and made major contributions to the development of the eWRT software package.

More information: www.ucomp.eu/
