August 30, 2018

Using deep-learning techniques to locate potential human activities in videos

by Agency for Science, Technology and Research (A*STAR), Singapore

When a police officer begins to raise a hand in traffic, human drivers realize that the officer is about to signal them to stop. But computers find it harder to work out people's next likely actions based on their current behavior. Now, a team of A*STAR researchers and colleagues has developed a detector that can successfully pick out where human actions will occur in videos, in almost real time.

Image analysis technology will need to become better at understanding human intentions if it is to be employed in a wide range of applications, says Hongyuan Zhu, a computer scientist at A*STAR's Institute for Infocomm Research, who led the study. Driverless cars must be able to detect police officers and interpret their actions quickly and accurately, for safe driving, he explains. Autonomous systems could also be trained to identify suspicious activities such as fighting, theft, or dropping dangerous items, and alert security officers.

Computers are already extremely good at detecting objects in static images, thanks to deep learning techniques, which use artificial neural networks to process complex image information. But videos with moving objects are more challenging. "Understanding human actions in videos is a necessary step to build smarter and friendlier machines," says Zhu.

Previous methods for locating potential human actions in videos did not use deep-learning frameworks and were slow and prone to error, says Zhu. To overcome this, the team's YoTube detector combines two types of neural networks in parallel: a static neural network, which has already proven to be accurate at processing still images, and a recurrent neural network, a type typically used for processing data that changes over time, for example in speech recognition. "Our method is the first to bring detection and tracking together in one deep learning pipeline," says Zhu.
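To make the distinction between the two kinds of network concrete, here is a toy sketch in Python. It is not the YoTube architecture, which the article does not describe in detail. The per-frame feature vectors, the weights, the single-unit recurrence and the averaging step are all assumptions chosen for illustration. The point is only that a static branch scores each frame in isolation, while a recurrent branch carries a hidden state forward, so its score for a frame depends on the frames that came before.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def static_scores(frames: Sequence[Sequence[float]],
                  w: Sequence[float]) -> List[float]:
    """Static branch: score every frame on its own.

    No information flows between frames, so identical frames
    always receive identical scores.
    """
    return [sigmoid(dot(w, f)) for f in frames]


def recurrent_scores(frames: Sequence[Sequence[float]],
                     w_in: Sequence[float],
                     w_rec: float) -> List[float]:
    """Recurrent branch: a one-unit Elman-style recurrence.

    The hidden state h summarizes earlier frames, so the score
    for a frame depends on its position in the sequence.
    """
    h = 0.0
    scores = []
    for f in frames:
        h = math.tanh(dot(w_in, f) + w_rec * h)
        scores.append(sigmoid(h))
    return scores


def fuse(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Combine the two branches by simple averaging (an assumption)."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


if __name__ == "__main__":
    # Three identical, hypothetical two-dimensional frame features.
    frames = [[1.0, 0.5], [1.0, 0.5], [1.0, 0.5]]
    s = static_scores(frames, w=[0.2, -0.1])
    r = recurrent_scores(frames, w_in=[0.5, 0.5], w_rec=0.8)
    print("static:   ", s)
    print("recurrent:", r)
    print("fused:    ", fuse(s, r))
```

Running the script on three identical frames shows the difference: the static scores are all equal, while the recurrent scores change from frame to frame because the hidden state accumulates context.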

The team tested YoTube on more than 3,000 videos routinely used in computer vision experiments. They report that it outperformed state-of-the-art detectors at correctly picking out potential human actions by approximately 20 per cent for videos showing general everyday activities and around 6 per cent for sports videos. The detector occasionally makes mistakes if the people in the video are small, or if there are many people in the background. Nonetheless, Zhu says, "We've demonstrated that we can detect most potential human action regions in an almost real-time manner."

More information: Hongyuan Zhu et al. YoTube: Searching Action Proposal Via Recurrent and Static Regression Networks, IEEE Transactions on Image Processing (2018). DOI: 10.1109/TIP.2018.2806279

Citation: Using deep-learning techniques to locate potential human activities in videos (2018, August 30) retrieved 17 July 2024 from https://phys.org/news/2018-08-deep-learning-techniques-potential-human-videos.html
