Robots learn by watching how-to videos

December 21, 2015 by Bill Steele
Scanning several videos on the same how-to topic, a computer finds instructions they have in common and combines them into one step-by-step series.

When you hire new workers you might sit them down to watch an instructional video on how to do the job. What happens when you buy a new robot?

Cornell researchers are teaching robots to watch instructional videos and derive a series of step-by-step instructions to perform a task. You won't even have to turn on the DVD player; the robot can look up what it needs on YouTube. The work is aimed at a future when we may have "personal robots" to perform everyday housework – cooking, washing dishes, doing the laundry, feeding the cat – as well as to assist the elderly and people with disabilities.

The researchers call their project "RoboWatch." Part of what makes it possible is that there is a common underlying structure to most how-to videos. And, there's plenty of source material available. YouTube offers 180,000 videos on "How to make an omelet" and 281,000 on "How to tie a bowtie." By scanning multiple videos on the same task, a computer can find what they all have in common and reduce that to simple step-by-step instructions in natural language.

Why do people post all these videos? "Maybe to help people or maybe just to show off," said graduate student Ozan Sener, lead author of a paper on the parsing method presented Dec. 16 at the International Conference on Computer Vision in Santiago, Chile. Sener collaborated with colleagues at Stanford University, where he is currently a visiting researcher.

A key feature of their system, Sener pointed out, is that it is "unsupervised." In most previous work, robot learning is accomplished by having a human explain what the robot is observing – for example, teaching a robot to recognize objects by showing it pictures of the objects while a human labels them by name. Here, a robot with a job to do can look up the instructions and figure them out for itself.

Faced with an unfamiliar task, the robot's computer brain begins by sending a query to YouTube to find a collection of how-to videos on the topic. The algorithm includes routines to omit "outliers" – videos that fit the keywords but are not instructional; a query about cooking, for example, might bring up clips from the animated feature Ratatouille, ads for kitchen appliances or some old Three Stooges routines.
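The article does not say how RoboWatch detects outliers, so the following is only a rough illustration of the idea, not the researchers' method. It assumes each video is reduced to the words in its subtitles and drops any video whose vocabulary barely overlaps with the rest of the collection. The function names, the sample subtitles and the 0.2 threshold are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def drop_outliers(videos, min_mean_similarity=0.2):
    """Keep videos whose subtitle vocabulary overlaps, on average, with the others.

    `videos` maps a video name to its subtitle text. The threshold is an
    arbitrary illustrative value, not one taken from the research.
    """
    words = {name: set(text.lower().split()) for name, text in videos.items()}
    kept = []
    for name, vocab in words.items():
        others = [w for other, w in words.items() if other != name]
        if others:
            mean_sim = sum(jaccard(vocab, w) for w in others) / len(others)
        else:
            mean_sim = 1.0  # a single video cannot be an outlier
        if mean_sim >= min_mean_similarity:
            kept.append(name)
    return kept


# Made-up subtitles: three omelet tutorials and one appliance ad.
videos = {
    "omelet_a": "crack the eggs whisk the eggs heat the pan pour the eggs fold",
    "omelet_b": "whisk the eggs heat the pan add butter pour the eggs fold",
    "omelet_c": "crack eggs whisk heat pan pour eggs fold and serve",
    "blender_ad": "buy our new blender today only limited offer",
}
print(drop_outliers(videos))
```

In this toy collection the ad shares no words with the tutorials, so it is the only video filtered out.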

The computer scans the videos frame by frame, looking for objects that appear often, and reads the accompanying narration – using subtitles – looking for frequently repeated words. Using these markers, it matches similar segments in the various videos and orders them into a single sequence. From the subtitles of that sequence it can produce written instructions. In other research, robots have learned to perform tasks by listening to verbal instructions from a human. In the future, information from other sources such as Wikipedia might be added.
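The matching-and-ordering idea can be sketched in a few lines. This is a text-only simplification under stated assumptions, not the published algorithm: it ignores the visual object detection entirely, treats each video as a list of subtitle segments, keeps keywords that appear in most of the videos, and orders them by where they first occur on average. The stopword list, the `min_share` parameter and the sample subtitles are all hypothetical.

```python
from collections import defaultdict

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "and", "in", "to", "then", "it", "of"}


def common_steps(videos, min_share=0.6):
    """Return keywords shared by most videos, ordered by average position.

    `videos` is a list of videos, each a list of subtitle segments (strings).
    A word's position in a video is the index of the first segment containing
    it, scaled to the range 0..1 so videos of different lengths are comparable.
    """
    positions = defaultdict(list)  # word -> one relative position per video
    for segments in videos:
        first_seen = {}
        for index, segment in enumerate(segments):
            for word in segment.lower().split():
                if word not in STOPWORDS and word not in first_seen:
                    first_seen[word] = index / max(len(segments) - 1, 1)
        for word, pos in first_seen.items():
            positions[word].append(pos)

    needed = min_share * len(videos)
    shared = {w: sum(p) / len(p) for w, p in positions.items() if len(p) >= needed}
    # Sort by average position; break ties alphabetically for a stable result.
    return sorted(shared, key=lambda w: (shared[w], w))


# Made-up subtitle segments for three omelet videos.
omelet_videos = [
    ["crack the eggs", "whisk it", "heat the pan", "pour and fold"],
    ["whisk the eggs", "heat a pan", "pour in", "fold"],
    ["crack eggs", "whisk", "add salt", "heat pan", "pour", "fold"],
]
print(common_steps(omelet_videos))
```

Words that occur in only one video, such as "salt" here, fall below the share threshold and are left out of the combined sequence.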

The learned knowledge from the YouTube videos is made available via RoboBrain, an online knowledge base robots anywhere can consult to help them do their jobs.



