September 9, 2010

New method to help computer vision systems decipher outdoor scenes

Computer vision systems can struggle to make sense of a single image, but a new method devised by computer scientists at Carnegie Mellon University enables computers to gain a deeper understanding of an image by reasoning about the physical constraints of the scene.

In much the same way that a child might use a set of toy building blocks to assemble something that looks like a building depicted on the cover of the toy set, the computer would analyze an outdoor scene by using virtual blocks to build a three-dimensional approximation of the image that makes sense based on volume and mass.

"When people look at a photo, they understand that the scene is geometrically constrained," said Abhinav Gupta, a post-doctoral fellow in CMU's Robotics Institute. "We know that buildings aren't infinitely thin, that most towers do not lean, and that heavy objects require support. It might not be possible to know the three-dimensional size and shape of all the objects in the photo, but we can narrow the possibilities. In the same way, if a computer can replicate an image, block by block, it can better understand the scene."

This novel approach to automated scene analysis could eventually be used to understand not only the objects in a scene, but the spaces in between them and what might lie behind areas obscured by objects in the foreground, said Alexei A. Efros, associate professor of robotics and computer science at CMU. That level of detail would be important, for instance, if a robot needed to plan a route where it might walk, he noted.

Gupta presented the research, which he conducted with Efros and Robotics Professor Martial Hebert, at the European Conference on Computer Vision, Sept. 5-11 in Crete, Greece.

Understanding outdoor scenes remains one of the great challenges of artificial intelligence. One approach has been to identify features of a scene, such as buildings, roads and cars, but this provides no understanding of the geometry of the scene, such as the location of walkable surfaces. Another approach, which Hebert and Efros pioneered with former student Derek Hoiem, now of the University of Illinois, Urbana-Champaign, has been to map the planar surfaces of an image to create a rough 3-D depiction of an image, similar to a pop-up book. But that approach can lead to depictions that are highly unlikely and sometimes physically impossible.

In the new method devised by Gupta, Efros and Hebert, the image is first broken into various segments corresponding to objects in the image. Once the ground and sky are identified, other segments are assigned potential geometric shapes. The shapes also are categorized as light or heavy, depending on appearance; a surface that appears to be a brick wall, for instance, would be classified as heavy.
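As a rough illustration of the labeling step described above, the sketch below attaches a candidate shape and a light or heavy class to each segment. It is not the researchers' code: the `Segment` type, the shape strings, the appearance labels and the `APPEARANCE_TO_DENSITY` lookup are all assumptions made for the example, and a real system would infer these from image features rather than from a fixed table.

```python
from dataclasses import dataclass
from enum import Enum


class Density(Enum):
    LIGHT = "light"
    HEAVY = "heavy"


# Hypothetical lookup from an appearance label to a density class.
# The labels and their assignments are illustrative only.
APPEARANCE_TO_DENSITY = {
    "brick": Density.HEAVY,
    "concrete": Density.HEAVY,
    "foliage": Density.LIGHT,
}


@dataclass
class Segment:
    name: str
    appearance: str  # stand-in for appearance features, e.g. "brick"
    shape: str       # candidate geometric shape (hypothetical label)


def classify_density(segment: Segment) -> Density:
    """Return heavy or light; unknown appearances default to light."""
    return APPEARANCE_TO_DENSITY.get(segment.appearance, Density.LIGHT)
```

In this sketch a segment that looks like a brick wall comes back as heavy, matching the example in the article; defaulting unknown appearances to light is an arbitrary choice made here for simplicity.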

The computer then attempts to reconstruct the image using the virtual blocks. If a heavy block appears unsupported, the computer must either substitute an appropriately shaped block or assume that the original block was obscured in the original image.
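The support test in that step can be sketched in a simplified form. The example below is an assumption-laden toy, not the published method: blocks are flat rectangles in a side view, the ground is at height 0.0, and a heavy block counts as supported if it sits on the ground or directly on top of another block that it overlaps horizontally. The `Block` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    x_min: float
    x_max: float
    bottom: float  # height of the lower face; 0.0 is the ground
    top: float     # height of the upper face
    heavy: bool


def rests_on(upper: Block, lower: Block, tol: float = 1e-6) -> bool:
    """True if upper's lower face touches lower's upper face with horizontal overlap."""
    touching = abs(upper.bottom - lower.top) <= tol
    overlap = min(upper.x_max, lower.x_max) - max(upper.x_min, lower.x_min)
    return touching and overlap > 0


def unsupported_heavy_blocks(blocks, tol: float = 1e-6):
    """Return the names of heavy blocks that rest on neither the ground nor another block."""
    flagged = []
    for block in blocks:
        if not block.heavy:
            continue  # light blocks are not checked in this sketch
        on_ground = abs(block.bottom) <= tol
        on_block = any(rests_on(block, other, tol) for other in blocks if other is not block)
        if not (on_ground or on_block):
            flagged.append(block.name)
    return flagged
```

A block flagged by this check is one the computer would need to replace with a different shape or explain by occlusion. The sketch does not verify that the supporting block is itself supported, which a fuller treatment would need to do.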

Gupta said that because this qualitative volumetric approach to scene understanding is so new, no established datasets or evaluation methodologies exist for it. In estimating the layout of surfaces other than sky and ground, he said, the method is better than 70 percent accurate, and its performance is almost as good when its segmentation is compared with ground truth. Overall, Gupta assesses the analysis as very good for 30 to 40 percent of the images and adequate for another 20 to 30 percent.

Provided by Carnegie Mellon University

Citation: New method to help computer vision systems decipher outdoor scenes (2010, September 9) retrieved 19 September 2024 from https://phys.org/news/2010-09-method-vision-decipher-outdoor-scenes.html
