Researchers improve automated recognition of human body movements in videos

June 8, 2015, Disney Research

An algorithm developed through collaboration of Disney Research Pittsburgh and Boston University can improve the automated recognition of actions in a video, a capability that could aid in video search and retrieval, video analysis and human-computer interaction research.

The core idea behind the new method is to express each action, whether it be a pedestrian strolling down the street or a gymnast performing a somersault, as a series of space-time patterns. These begin with elementary movements, such as a leg moving up or an arm flexing. But these movements also combine into spatial and temporal patterns - a leg moves up and then the arm flexes, or the leg moves up below the arm. Or the pattern might describe a hierarchical relationship, such as a leg moving as part of the more global movement of the body.
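These relationships can be pictured as a small tree of movements. The sketch below is purely illustrative - the class names, fields, and relation labels are assumptions, not taken from the paper - but it shows how an elementary movement can relate to its parent temporally (happens after), spatially (e.g. below), or hierarchically (part of a larger motion):

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical sketch of one node in a space-time tree.
# Names and fields are illustrative, not from the paper.
@dataclass
class SpaceTimeNode:
    label: str                   # elementary movement, e.g. "leg up"
    frame_span: Tuple[int, int]  # (start_frame, end_frame) of the movement
    relation: str = "root"       # relation to parent: "temporal",
                                 # "spatial", or "hierarchical"
    children: List["SpaceTimeNode"] = field(default_factory=list)

# A toy tree for "the leg moves up as part of the whole-body motion,
# and then the arm flexes".
body = SpaceTimeNode("body moves", (0, 30))
leg = SpaceTimeNode("leg up", (0, 10), relation="hierarchical")
arm = SpaceTimeNode("arm flexes", (10, 20), relation="temporal")
body.children.append(leg)
leg.children.append(arm)
```

A recognizer would then match many such small trees against a video rather than one rigid template, which is what lets the method tolerate variation in how an action is performed.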

"Modeling actions as collections of such patterns, instead of a single pattern, allows us to account for variations in action execution," said Leonid Sigal, senior research scientist at Disney Research Pittsburgh. The method thus can recognize a tuck or a pike movement by a diver, even if one diver's execution differs from another's, or if the angle of the dive blocks part of one diver's body but not another's.

The method, developed by Sigal along with Shugao Ma, a Disney Research intern who is a Ph.D. student at Boston University, and his adviser, Prof. Stan Sclaroff, can detect these structures directly from a video without fine-level supervision.

"Knowing that a video contains 'walking' or 'diving,' we automatically discover which elements are most important and discriminative and with which relationships they need to be stringed together into patterns to improve recognition," Ma said.
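The idea of discovering discriminative elements from only video-level labels can be sketched as follows. This is a deliberately simplified stand-in - a frequency-contrast score over toy data, not the paper's actual learning procedure - and the function name, the scoring rule, and the toy videos are all assumptions for illustration:

```python
# Illustrative sketch of weak supervision: given only video-level labels
# ("walking" vs. "diving"), rank candidate movement patterns by how well
# they separate the two classes. The score here is a simple frequency
# contrast; the paper's actual discovery method is more involved.
def discriminativeness(pattern, positive_videos, negative_videos):
    """Fraction of positive videos containing the pattern minus the
    fraction of negative videos containing it."""
    pos = sum(pattern in v for v in positive_videos) / len(positive_videos)
    neg = sum(pattern in v for v in negative_videos) / len(negative_videos)
    return pos - neg

# Toy videos represented as sets of detected elementary movements.
walking = [{"leg up", "arm swing"}, {"leg up", "torso sway"}]
diving = [{"tuck", "arm swing"}, {"pike", "torso sway"}]

candidates = ["leg up", "arm swing", "tuck"]
ranked = sorted(candidates,
                key=lambda p: discriminativeness(p, walking, diving),
                reverse=True)
# "leg up" occurs in every walking video and no diving video,
# so it ranks first.
```

A pattern like "leg up" that appears in all walking videos but no diving videos scores highest, so it survives as a discriminative element; patterns shared by both classes score near zero and are discarded.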

Representing actions as space-time trees not only improves recognition of actions but also is concise and efficient, said the researchers, who will present their findings at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), June 7-12 in Boston.

The researchers tested the method on standard datasets, one containing 10 different sports actions and another with four interactions, such as kisses and hugs, depicted in TV programs. On these datasets, the Disney algorithm outperformed previously reported methods, identifying elements with a richer set of relations among them. They also found that the space-time trees discovered in one dataset to discriminate between hugs and kisses performed promisingly in discriminating hugs or kisses from other actions, such as eating and using a phone, that were not part of the original dataset.

More information: "Space-Time Tree Ensemble for Action Recognition" … ecognition-Paper.pdf
