Listen, watch, read -- computers search for meaning

Oct 30, 2009

(PhysOrg.com) -- European researchers have created the first semantic search platform that integrates text, video and audio. The system can 'watch' films, 'listen' to audio and 'read' text to find relevant responses to semantic search terms. At last, computers are able to look for meaning in our multimedia searches.

There is a phenomenal amount of content out there on the internet, but therein lies a problem. Sure, text content can be skimmed or glanced at, but audiovisual content has to be viewed in linear time. It is very difficult to search inside a film or audio recording for relevant information.

But European researchers in the MESH project have developed an integrated platform which, they say, can for the first time combine semantic search - or search by the meaning of the words - with a host of associated tools to deliver more relevant information from a wide variety of sources, all accessible to an individual user.

The platform can search annotated files from any type of media - photographs, videos, sound recordings, text, document scans - using a host of techniques including optical character recognition, automated speech recognition and automatic annotation of movies and photographs that track salient concepts.

Technology shift

This represents an emerging paradigm shift in search technology.

Here is why. Right now, text in computing is represented as a series of numbers, most commonly those assigned by the Unicode standard. Each number signifies a particular character, and computers can scan these codes very quickly. But when you enter a search term, the machine has no idea what those letters signify. It simply looks for the pattern - it has no inkling of the concept behind the pattern.
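
To make the point concrete, here is a minimal Python sketch of pattern-only search. It is an illustration, not part of the MESH platform: the function names and sample sentences are invented for this example. It shows a word as Unicode code points and a keyword match that finds the letter pattern but misses a sentence describing the same event in different words.

```python
def code_points(text):
    """Return the Unicode code point of each character in text."""
    return [ord(ch) for ch in text]


def keyword_match(term, document):
    """Case-insensitive substring test: pure pattern matching, no meaning."""
    return term.lower() in document.lower()


# The machine sees only numbers, not the idea of a flood.
print(code_points("flood"))  # [102, 108, 111, 111, 100]

# Same event, different wording: the pattern is absent, so there is no match.
print(keyword_match("flood", "Rivers burst their banks after heavy rain"))  # False
print(keyword_match("flood", "Flood warnings were issued overnight"))       # True
```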

But in semantic search, every bit of information is defined by potentially dozens of meaningful concepts. When a copywriter invoices for his or her work, for example, the date could be defined in terms of calendar, invoice, billing period, and so on. All these definitions for one piece of information are called ‘metadata’, or information about information.

Collections of agreed metadata terms for a particular field or task, like medicine or accounting, are called ontologies.

So the computer not only searches for the term, it searches for related metadata that defines types of information in specific ways. In reality, the computer still does not ‘understand’ a concept in its semantic search - it continues to look for patterns of letters. But because the concepts behind the search terms are included, it can return results based on concepts as well as text patterns.
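
The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the MESH implementation: the mini-ontology, the annotated items and the `semantic_search` function are all hypothetical. Each item carries concept metadata, and a search returns items that match either the letter pattern or a concept linked to the search term.

```python
# Hypothetical mini-ontology: concept -> related terms (illustrative only).
ONTOLOGY = {
    "flood": {"flood", "inundation", "river overflow"},
    "earthquake": {"earthquake", "tremor", "seismic"},
}

# Hypothetical media items; "concepts" stands in for annotation metadata.
ITEMS = [
    {"id": "video-1", "text": "Rivers burst their banks", "concepts": {"flood"}},
    {"id": "text-1", "text": "Flood warnings were issued", "concepts": set()},
    {"id": "audio-1", "text": "A tremor shook the city", "concepts": {"earthquake"}},
]


def semantic_search(term, items, ontology):
    """Return ids of items matching the term as text or via concept metadata."""
    term = term.lower()
    # Concepts linked to the term: still string comparison, not understanding.
    concepts = {c for c, related in ontology.items() if term in related}
    hits = []
    for item in items:
        text_hit = term in item["text"].lower()
        concept_hit = bool(concepts & item["concepts"])
        if text_hit or concept_hit:
            hits.append(item["id"])
    return hits


print(semantic_search("flood", ITEMS, ONTOLOGY))   # ['video-1', 'text-1']
print(semantic_search("tremor", ITEMS, ONTOLOGY))  # ['audio-1']
```

Note that "video-1" never contains the word "flood"; it is returned only because its metadata carries the flood concept.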

Imminent domains

These technologies are becoming common in particular knowledge domains, and more are emerging every day, but most relate to the concepts behind text-based documents. The MESH platform sought to use semantic search for every type of media.

On the way, it created some cutting-edge technology. “Our automatic annotation for video, for example, is state of the art,” explains Pedro Concejero, coordinator of the MESH project.

“The annotation system is capable of identifying the general scene setting, such as whether a video is a studio shot or a shot recorded on location. With adequate training, it can also detect (within some error margins) the general topic of the video, such as a scene about an earthquake or a flood. It can also find a number of salient objects within the scene, such as persons or fire, but cannot yet identify consistently objects with great variations in shape or aspect.”

One of the major challenges of the project was a product of its own success: It annotated too much information!

“This is good - it is what we wanted the system to do - but the quantity of data was vast, too much to handle, so we had to find ways to cut down on the amount of metadata,” Concejero tells ICT Results.

Manual override

So the project developed a manual annotation tool that can, with a little training, be used by non-technical people. “It is a very powerful, very advanced professional program. There are other manual annotation tools available commercially, but we have developed a strong and user-friendly program that could probably compete very successfully with what is currently available.”

For the project, the platform was developed to search video news sources relating to civil unrest and street violence, and natural disasters like earthquakes, forest fires and floods.

“We had to focus the demonstrator because there is a lot of work involved in developing ontologies for specific news topics. You would need to develop a very detailed ontology for politics, or crime and so on. We have designed the system so that it can accept ontologies from elsewhere, but for the demonstrator we reserved our work to these two domains,” says Concejero.

The beginning of the end?

The technology will not be challenging the industry-leading search engines any time soon. This project does not necessarily mark the end of the type of keyword-based search that we use every day.

But it could well be the beginning of the end. In the meantime, the work of the MESH project will find a happy home in a number of stand-alone commercial applications, and work to develop new applications will, in one way or another, continue.

More information: MESH project

This is part one of a two-part special feature on the MESH project.

Provided by ICT Results


User comments : 1

RayCherry
Oct 30, 2009
Language/Text dependent concept identification still creates barriers that will limit the use to a specific culture (English speaking).

When the concepts themselves achieve independence from Language/Text, then the system will be able to catalogue and cross reference multi-cultural concepts, providing a global 'map' of strongly and weakly linked concepts that underlie the languages used to express/communicate them.

Eliminating the Language barrier within the machines will help the users to do the same.
