Software: running commentary for smarter surveillance?

Cutting-edge surveillance software that automatically detects human motion, behaviour and facial expressions, generates a running commentary of what’s happening and re-enacts events virtually could soon be helping police and security services.

The system, developed by a team of researchers from five European countries, provides a comprehensive and innovative solution to the information overload facing police forces and public and private security services.

With millions of surveillance cameras across Europe capturing what happens on city streets and major meeting points like airports, malls and buildings, monitoring and analysing these video streams has become an epic task. Technology such as automated motion detection, object tracking and behaviour analysis has eased some of the burden, but a gap continues to exist between what surveillance cameras see and how it can be described and interpreted in terms a human operator or computer can understand. Bridging this semantic gap is important because meaningful descriptions of events can trigger meaningful automated or human responses that could spot a crime in progress, prevent injuries or save lives.

“The semantic gap in the analysis of human behaviour from digital video is huge,” explains Andrew Bagdanov, a senior researcher at the Computer Vision Centre (CVC) of the Universitat Autonoma in Barcelona, Spain. “Most surveillance software operates only at a very low level… in order to bridge the gap it is necessary to build an artificial cognitive solution that operates at a much higher level, which is able to analyse footage, describe the events taking place and reason about what is going on.”

Thanks to research carried out by a multidisciplinary team working in the HERMES project, an EU-funded initiative named, fittingly, after the messenger of the gods in Greek mythology, such a solution now exists.

The state-of-the-art HERMES system consists of a scalable, flexible platform, integrating software components that not only detect events in real time as they are filmed but also describe them semantically and react to them intelligently. It operates at three levels: tracking the movement of people and objects; monitoring the behaviour of people; and, in the case of high-resolution footage taken at close quarters, detecting changes in facial expression.

Monitoring motion, detecting behaviour

Whereas most surveillance video tracking systems operate in a state of perpetual surprise, dumbly following a single target and struggling to reacquire it if lost, the HERMES tracking technology functions more like a human monitoring the same scene, making predictions about where a target is heading and also reacting to any other events in the scene that appear unusual.
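To make the idea of predictive tracking concrete, here is a minimal sketch of how a tracker might guess where a target is heading and use that guess to pick the target up again after losing it. The article does not say how HERMES makes its predictions; the constant-velocity assumption, the `Track` class and the `predict` and `reacquire` functions are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Last known state of a target (hypothetical structure, not from HERMES)."""
    x: float   # position in metres
    y: float
    vx: float  # velocity in metres per second
    vy: float


def predict(track, dt):
    """Extrapolate the target's position dt seconds ahead, assuming constant velocity."""
    return (track.x + track.vx * dt, track.y + track.vy * dt)


def reacquire(track, detections, dt, max_dist):
    """Return the detection closest to the predicted position, or None.

    Only detections within max_dist metres of the prediction are considered,
    so an unrelated person far from the expected path is not mistaken for the target.
    """
    px, py = predict(track, dt)
    best = None
    best_d2 = max_dist ** 2
    for dx, dy in detections:
        d2 = (dx - px) ** 2 + (dy - py) ** 2
        if d2 <= best_d2:
            best, best_d2 = (dx, dy), d2
    return best
```

A tracker with no prediction step would have to search the whole frame again each time the target is lost; narrowing the search to the area around the predicted position is one simple way to avoid that.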

“Say two people meet in the street and start to run. The system will detect the change in behaviour and start to follow them. It could alert a human operator if the pattern of behaviour seems suspicious… such as if it appears someone has had their bag stolen,” Bagdanov, who oversaw the project’s validation activities, says.

Using a combination of static cameras, which provide an overall view of an area, and Pan-Tilt-Zoom (PTZ) cameras, so-called “eyes in the sky” that zoom and move to follow a target, the system is able to automatically track a person as they walk down a street or even across an entire city.

This smarter tracking is made possible by the HERMES researchers’ approach to solving the semantic gap. Instead of tracking objects in a scene directly - the current, low-level approach - the HERMES platform generates a running commentary in natural language text of what is going on: “A pedestrian labelled ‘Actor 3’ appears in the field of view,” “He moves on the southeastern sidewalk,” “Actor 3 stands nearby another pedestrian” etc.

This semantic information, generated automatically in real time, is then used by the artificial cognitive system to reason about events and behaviours of interest. Human operators, in turn, receive a more accurate description of what is occurring, and can more easily and quickly retrieve specific scenes from a recording with a simple text-based search. The current version of the system can generate text in six different languages.
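The step from tracked events to searchable text can be illustrated with a short sketch. It assumes the tracker emits simple structured events, turns each one into a sentence using fixed templates, and then filters the resulting log by keyword. The `Event` fields, the template wording and the function names are assumptions made for this example; the article does not describe the actual HERMES language generation, which also supports several languages.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One tracked event (hypothetical structure for illustration)."""
    time_s: float     # seconds since the start of the recording
    actor: int        # numeric label assigned to the tracked person
    kind: str         # "appear", "move" or "stand_near"
    detail: str = ""  # location or other actor, depending on kind


# Sentence templates, loosely modelled on the examples quoted in the article.
TEMPLATES = {
    "appear": "A pedestrian labelled 'Actor {actor}' appears in the field of view",
    "move": "Actor {actor} moves on the {detail}",
    "stand_near": "Actor {actor} stands near {detail}",
}


def describe(event):
    """Render a single event as a sentence."""
    return TEMPLATES[event.kind].format(actor=event.actor, detail=event.detail)


def build_log(events):
    """Return a time-ordered list of (time_s, sentence) pairs."""
    ordered = sorted(events, key=lambda e: e.time_s)
    return [(e.time_s, describe(e)) for e in ordered]


def search(log, query):
    """Return log entries whose sentence contains every word of the query."""
    words = query.lower().split()
    return [(t, s) for t, s in log if all(w in s.lower() for w in words)]
```

With a log built this way, an operator looking for the moment Actor 3 was on the sidewalk could call `search(log, "actor 3 sidewalk")` and jump to the returned timestamps instead of scrubbing through the video.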

3D models, automatically and in near real time

Generating semantic information from video in this way also enabled the HERMES researchers to develop another powerful tool as part of the system: a virtual 3D representation of the scene.

“The virtual graphical representation of the footage is generated in near real time and can be displayed alongside the actual video stream. Because it is virtual and 3D it allows operators to look at events from angles they would otherwise be unable to,” Bagdanov notes.

The outdoor applications for the system - focused, primarily, on motion and behaviour detection - were tested extensively in Barcelona earlier this year, where cameras attached to the CVC building were used to monitor events in the street outside.

“The system held up better than we expected, though when there are more than 20 people in the scene it starts to break down. This, however, is a problem that can be solved with more cameras and more computer processing power, so the system should scale well,” Bagdanov says.

Indoor applications of the system were developed and tested at ETH Zurich in Switzerland and Oxford University in the United Kingdom, both project partners. There, the facial expression recognition component showed the potential for the system to detect different emotions, especially powerful ones such as fear or anger.

Though facial expression detection does have security applications, Bagdanov notes that the technology could prove useful in research on human-computer interaction, for example, to make communication between humans and robots more natural.

“The HERMES project focused principally on developing technology for security and surveillance, but our research has uses in many other fields, not least human-computer interaction, natural language processing, multimedia communications and semantic annotation and search,” the project technical coordinator says.

He notes that several project partners are developing commercial applications based on the work carried out in HERMES, and that one or more spin-off companies are under consideration.

More information: HERMES project -
Provided by ICT Results
Citation: Software: running commentary for smarter surveillance? (2010, March 16) retrieved 17 June 2019 from

User comments

Mar 16, 2010
Great system. I anticipate a few "big brother" comments to appear shortly. I for one see this as a great way to ensure the safety of citizens.

A gang member flashes a gun. The system scans his face and searches to see whether or not he has a permit for the weapon. If he does not, the information is sent to the nearest police unit equipped with facial recognition as well. The unit scans the crowds, detects criminal, and he is arrested.
Result: My fiance and I can walk down the street without a thug pulling a gun on us. Technology trumps idiotic criminals.

And no, I don't care if a system watches me shop at a mall or drive home. As long as it stays out of my house I'm o.k.

Mar 18, 2010
Don't worry, you'll want it in your home too. All you need is a little more fear; watch: What if it isn't in your house and you have a heart attack? If it isn't in your house, how would you know whether your fiance was cheating? What if one of your house guests made a hateful facial expression while using your rest room? You would never know!

What if the energy put into creating products that can be made to sell via fear and mistrust went into producing systems that create joy and connection?

Mar 19, 2010
Until recently, everyone lived in villages. Everybody knew what was happening anywhere in the village; no secrets were possible, but people did well, and no (secret) crimes were possible. Now that is lost, and everybody is anonymous in big cities. A system like this can make it a village again. I see no problem in that if the information collected is public. Surveillance can be a weapon only if one party has the information and others do not.
