Seeing is understanding -- using artificial intelligence to analyse multimedia content

May 17, 2010
(PhysOrg.com) -- The media produce a glut of material daily. Refining that ore into the gold of useful information requires new approaches. European researchers have now made automated multimedia analysis much smarter.

Picture a few seconds of coverage from a sporting event, say the Wimbledon finals. Your television might show a snippet of action plus the players’ names, scores, and other text scrolling across the screen, while the audio feed might feature expert commentary.

Multiply that multimedia feed by every sporting event being broadcast anywhere in the world. Then toss in all the other activities covered by the media - news, politics, pop culture, not to mention YouTube and other social media. And finally, imagine trying to make sense of this torrent of information so that it can be categorised, labelled, indexed, searched and retrieved as needed.

That’s the challenge that the EU-funded research project BOEMIE (for Bootstrapping Ontology Evolution with Multimedia Information Extraction) accepted in 2006. Its researchers have now shown that by using state-of-the-art artificial intelligence (AI) techniques to build and then refine highly structured knowledge bases, they can automatically or semi-automatically identify, analyse and index almost any multimedia content.

BOEMIE’s smart toolkit has significant commercial and research potential in any kind of multimedia annotation and retrieval. “Without semantic indexing, it’s very difficult to retrieve multimedia content,” says George Paliouras, BOEMIE’s technical manager. “BOEMIE offers a new approach to do this at a large scale and with high precision.”

By your bootstraps

It’s impossible to pick oneself up by one’s own bootstraps, but BOEMIE manages a close approximation.

Of necessity, BOEMIE needs to start with some knowledge of the domain it will be analysing. That basic knowledge comes from domain experts who are prompted by the BOEMIE Semantic Manager to define and relate key concepts using natural language. For example, the concept “tennis match” might be defined as a type of sporting event, and the concept “Wimbledon finals” might be defined as an example of a tennis match.

BOEMIE automatically organises this information into an ontology - a formal way of representing concepts and the relationships between them within a chosen domain. Many AI applications use ontologies to represent knowledge about specific areas in a systematic and useful way.
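To make the idea of an ontology concrete, here is a minimal Python sketch of the tennis example. It is not BOEMIE's actual data model: the `Ontology` class and its method names are hypothetical, and a real ontology language supports far richer relationships than the two shown here ("is a type of" and "is an example of").

```python
class Ontology:
    """Toy ontology with two relations: 'is a type of' and 'is an example of'."""

    def __init__(self):
        self.parents = {}    # concept -> parent concept (None for a top-level concept)
        self.instances = {}  # individual -> the concept it is an example of

    def add_concept(self, name, parent=None):
        if parent is not None and parent not in self.parents:
            raise KeyError(f"unknown parent concept: {parent}")
        self.parents[name] = parent

    def add_instance(self, name, concept):
        if concept not in self.parents:
            raise KeyError(f"unknown concept: {concept}")
        self.instances[name] = concept

    def ancestors(self, concept):
        """Return every broader concept above `concept`, nearest first."""
        chain = []
        current = self.parents[concept]
        while current is not None:
            chain.append(current)
            current = self.parents[current]
        return chain

    def is_a(self, name, concept):
        """True if `name` (a concept or an individual) falls under `concept`."""
        start = self.instances.get(name, name)
        if start not in self.parents:
            return False
        return start == concept or concept in self.ancestors(start)


sports = Ontology()
sports.add_concept("sporting event")
sports.add_concept("tennis match", parent="sporting event")
sports.add_instance("Wimbledon finals", "tennis match")

print(sports.is_a("Wimbledon finals", "sporting event"))  # True
```

Because the relationships are explicit, the system can conclude that the Wimbledon finals are a sporting event even though nobody stated that fact directly.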

When BOEMIE starts to analyse a multimedia feed, it uses newly developed video, image, audio and text analysis tools to extract as much information as it can. From the Wimbledon coverage, for example, it might note that there are two players interacting across a net on a playing surface of a particular size. This might allow it to categorise what it is “viewing” as a tennis match. From the audio track or on-screen text, it might tentatively connect the players with their names.

As the BOEMIE Ontology Evolution Toolkit tries to place the information it extracts into the existing ontology, it’s likely to discover that it needs new concepts. For example, it might notice that the commentator repeatedly uses the word “championship” or the phrase “Grand Slam.” The system automatically proposes these new concepts for the ontology, which can be accepted, rejected or modified by the domain expert.
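The propose-and-review step can be pictured with a short Python sketch. The article does not say how BOEMIE decides which terms to propose, so the frequency threshold used here is an assumption, and `propose_concepts`, `review` and `min_count` are hypothetical names chosen for illustration.

```python
from collections import Counter


def propose_concepts(extracted_terms, known_concepts, min_count=2):
    """Suggest terms that recur in the extracted text but are not yet concepts.

    Assumption: a term is worth proposing once it appears `min_count` times.
    """
    counts = Counter(term.lower() for term in extracted_terms)
    known = {concept.lower() for concept in known_concepts}
    return sorted(
        term for term, n in counts.items() if n >= min_count and term not in known
    )


def review(proposals, decide):
    """Apply the domain expert's verdict to each proposal.

    `decide(term)` returns "accept", "reject", or a replacement name
    (the expert modifying the proposal).
    """
    accepted = []
    for term in proposals:
        verdict = decide(term)
        if verdict == "reject":
            continue
        accepted.append(term if verdict == "accept" else verdict)
    return accepted


terms = ["championship", "Grand Slam", "tennis match",
         "championship", "grand slam", "umpire"]
known = {"tennis match", "sporting event"}

proposals = propose_concepts(terms, known)
print(proposals)  # ['championship', 'grand slam']

# The expert accepts "championship" as is and renames "grand slam".
approved = review(
    proposals,
    lambda term: "Grand Slam event" if term == "grand slam" else "accept",
)
print(approved)  # ['championship', 'Grand Slam event']
```

Keeping the expert's decision as a separate function mirrors the article's description: the system only proposes, and a person has the final say.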

Much like a human researcher, BOEMIE searches the web for needed information. For example, it might access Wikipedia or other sources to define a Grand Slam event, find out where Wimbledon is located in order to place it on a map, or find biographies of the contestants.

The key bootstrapping cycle is managed by the BOEMIE Bootstrapping Controller. After enriching the knowledge base, the system then re-analyses the same footage, guided by the newly enriched ontology. This lets the system extract even more information and propose still more refinements to the knowledge base.

“This cycle of improvement of our domain knowledge and then going back with that improved knowledge to extract even more knowledge can happen several times,” says Paliouras. “This is the novel aspect of BOEMIE.”
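The cycle Paliouras describes can be sketched as a loop that alternates between analysing the footage and enriching the set of known concepts until nothing new turns up. This is a toy illustration under stated assumptions, not the BOEMIE Bootstrapping Controller: `Observation`, `extract`, `propose` and `bootstrap` are hypothetical, the "footage" is a hand-written list, and the expert review step is left out for brevity.

```python
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    needs: str               # concept that must be known to recognise this item
    text: str                # what the analysis tools would report
    suggests: Optional[str]  # new concept hinted at by this item, if any


def extract(footage, concepts):
    """Return the items the current set of concepts lets us recognise."""
    return [obs for obs in footage if obs.needs in concepts]


def propose(facts):
    """Collect the new concepts hinted at by the extracted facts."""
    return {obs.suggests for obs in facts if obs.suggests is not None}


def bootstrap(footage, concepts, max_rounds=5):
    """Alternate enrichment and re-analysis until no new concepts appear."""
    concepts = set(concepts)
    facts = extract(footage, concepts)
    for _ in range(max_rounds):
        new = propose(facts) - concepts
        if not new:
            break
        concepts |= new
        facts = extract(footage, concepts)  # re-analyse the same footage
    return concepts, facts


footage = [
    Observation("tennis match", "two players across a net", "grand slam"),
    Observation("grand slam", "commentator mentions the Grand Slam", "championship"),
    Observation("championship", "on-screen text names the champion", None),
]

final_concepts, final_facts = bootstrap(footage, {"tennis match"})
print(sorted(final_concepts))  # ['championship', 'grand slam', 'tennis match']
print(len(final_facts))        # 3
```

Starting from a single concept, each pass over the same footage recognises one more item, which in turn suggests the next concept. The `max_rounds` cap is a safeguard added for the sketch so the loop always terminates.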

The BOEMIE package also includes a semantic browser that allows non-expert users to search for the multimedia information they need using the concepts and relationships BOEMIE has built up.

Putting BOEMIE through its paces

The BOEMIE researchers decided to test the system in the area of sports, where they knew they could find plenty of multimedia content and would not need to involve specialists to help build the knowledge base.

They found that the combination of BOEMIE’s content analysis tools, the natural language interface and flexible ontology building tool, and their novel bootstrapping approach allowed them to extract information from multimedia sports coverage much more efficiently and accurately than existing automated systems.

Paliouras points out that BOEMIE is not limited to sports. The toolkit can speed up and improve the analysis, categorisation, indexing and retrieval of almost any kind of multimedia content. “BOEMIE can add value to any form of multimedia analysis, and make the work of a domain expert easier and more manageable,” he says.

Project coordinator Constantine Spyropoulos notes that a variety of potential customers are interested in implementing parts of the BOEMIE toolkit. The International Association of Athletics Federations wants to boost its content retrieval capabilities using BOEMIE. Advertisers are interested in how BOEMIE can help them reach particular audiences and monitor the exposure of their products. Politicians are intrigued by BOEMIE’s ability to filter a torrent of information to determine what people are saying about a particular issue, and news organisations are exploring how the system can help them.

“The methodology we’ve developed is universal,” says Spyropoulos. “It can apply to any area, any domain.”

More information: BOEMIE project - www.boemie.org/

User comments : 2

ydroustan
not rated yet May 18, 2010
BOEMIE is a very helpful tool. I would not call it AI however. It still is unable to commit suicide or love another machine.
abhishekbt
not rated yet May 21, 2010
@ydroustan :- Even though the emotions you speak of may be part of a highly evolved and sophisticated AI machine in our future, they are not the basic requirement.

The basic is being able to make a decision when presented with a set of facts. I agree though, that BOEMIE is more a tool than an AI system.