June 23, 2016

Computer vision system studies word use to recognize objects it has never seen before

by Disney Research

Computer vision systems typically learn how to recognize an object by analyzing images of thousands of examples. But scientists at Disney Research have shown that computers also can learn to recognize objects they have never seen before, based in part on studying vocabulary.

People, after all, can get an idea of what things might look like based on reading a book. Similarly, a computer that already has been taught to recognize certain objects - apples, for instance - can analyze word use to get hints about the existence of fruits such as pears and peaches, and how they might differ from apples, said Leonid Sigal, senior research scientist at Disney Research.

The knowledge that other fruits exist also is helpful in teaching the computer about important characteristics of apples themselves, he added.

"This opens the door to a new learning paradigm," Sigal said. By reducing the need to train vision systems with thousands of labeled images, it could help reduce the time necessary for computers to learn new objects and expand the number of object categories that computers can recognize.

Sigal and Yanwei Fu, a post-doctoral researcher at Disney Research, will present this new learning model, called semi-supervised vocabulary-informed learning, at the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, June 26 in Las Vegas.

"We've seen unprecedented advances in object recognition and object categorization in recent years, thanks to the development of convolutional neural networks," said Jessica Hodgins, vice president at Disney Research. "But the need to train vision software with thousands of labeled examples for each object has created a bottleneck and limited the number of object classes that can be recognized. Vocabulary-informed learning promises to break that bottleneck and make computer vision more useful and reliable."

For this study, the computer learned its vocabulary by being trained on all of the articles in Wikipedia and on UMBC WebBase, a dataset with three billion English words. From that text, it gleaned more than 300,000 object categories and discovered statistical associations between them. For instance, the computer may have been trained to recognize cars and buses, but from the word analysis it could surmise that there are other categories of vehicles, such as vans, mini-vans and SUVs, and get hints about how each differs from a car or a bus based on its linguistic use.
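As a rough illustration of how statistical associations between words can be pulled from text, the sketch below counts which words appear near each other in a tiny made-up corpus and compares the resulting count vectors. The corpus, the window size and the function names are assumptions made for this example; the article does not describe the researchers' actual word model, which was trained on billions of words.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
CORPUS = [
    "the car drove down the road",
    "the van drove down the road",
    "the bus drove down the street",
    "she ate a ripe apple",
    "she ate a ripe pear",
]

def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = cooccurrence_vectors(CORPUS)
sim_car_van = cosine(vectors["car"], vectors["van"])    # used in similar contexts
sim_car_pear = cosine(vectors["car"], vectors["pear"])  # used in different contexts
print(f"car~van: {sim_car_van:.2f}, car~pear: {sim_car_pear:.2f}")
```

In this toy setting, "car" and "van" come out as closely associated because they occur in the same contexts, while "car" and "pear" share none. That is the kind of hint that could tell a system a van is another sort of vehicle.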

Simply knowing that these categories exist helps the system as it is trained with images to recognize objects, Sigal said, resulting in the creation of better models for seen objects. Information it gets from the vocabulary analysis can then also suggest how it might recognize other, as-yet unseen objects. If it knows what an apple looks like, for instance, the vocabulary may suggest that a pear, which it has never seen, might be of similar size, but elongated.
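The idea of recognizing an unseen object can be sketched by placing image descriptions and vocabulary words in one shared space and picking the closest word. In the example below, the three-number vectors, their meanings and the `nearest_category` helper are all hypothetical, and the step that turns an image into such a vector is assumed to exist already; this is not the researchers' model.

```python
import math

# Hypothetical word vectors: (roundness, elongation, wheels).
# Values are invented for illustration only.
VOCABULARY = {
    "apple": (1.0, 0.1, 0.0),  # seen in training images
    "car": (0.0, 0.6, 1.0),    # seen in training images
    "pear": (0.8, 0.6, 0.0),   # never seen in images, known only from text
}

def nearest_category(embedding, vocabulary):
    """Return the vocabulary word whose vector is closest to the embedding."""
    return min(vocabulary, key=lambda name: math.dist(embedding, vocabulary[name]))

# Assumed output of an image-to-vector model for a photo of a pear.
image_embedding = (0.75, 0.55, 0.05)
predicted = nearest_category(image_embedding, VOCABULARY)
print(predicted)
```

Because "pear" has a place in the vocabulary even without training images, an image that lands near it can be labeled as a pear rather than forced into the "apple" category.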

"I've never been to Africa, but I read books so I know what to expect," Fu said. "We use our brains to organize information and contextualize how unknown things might look. Compared with previous semi-supervised learning, our vocabulary-informed paradigm is perhaps more similar to how humans reason."

In their testing, Sigal and Fu found that semi-supervised, vocabulary-informed learning worked better and required fewer training examples than other learning techniques, including zero-shot learning, a widely studied approach that introduces new objects during testing, rather than during training.

According to Sigal, computer vision systems can now recognize thousands of objects, but with this new method they can learn to recognize 300,000 categories based on the vocabulary they develop.

"We didn't try to mimic humans exactly, but making the learning approach more human-like was a motivating factor," Sigal said. "It is a different form of learning and so will motivate researchers to develop different types of algorithms."

More information: "Semi-supervised Vocabulary-informed Learning"

Provided by Disney Research

Citation: Computer vision system studies word use to recognize objects it has never seen before (2016, June 23) retrieved 17 July 2024 from https://phys.org/news/2016-06-vision-word.html

