# Algorithm for identifying object boundaries in digital images 50,000 times more efficient than predecessor

##### May 31, 2011 by Larry Hardesty

Determining the boundaries of objects is one of the central problems in computer vision. It's something humans do with ease: We glance out the window and immediately see cars as distinct from the sidewalk and street and the people walking by, or lampposts as distinct from the facades of the buildings behind them. But duplicating that facility in silicon has proven remarkably difficult.

One of the best ways for a computer to determine is to make lots of guesses and compare them; the boundaries that most of the guesses agree on are likeliest to be accurately drawn. Until now, that process has been monstrously time consuming. But Jason Chang, a graduate student in the Department of and Computer Science, and John Fisher, a principal research scientist at MIT's and Artificial Intelligence Lab (CSAIL), have figured out how to make it at least 50,000 times more efficient. Their findings could help improve systems for , for tracking and for 3-D , among others.

One reason that boundary determination — or as it's more commonly known, image segmentation — is such a hard problem is that there's no one right answer. Ask 10 people to trace the boundaries of objects in a digital image, and you'll likely get 10 different responses. "We want an algorithm that's able to segment images like humans do," Chang says. "But because humans segment images differently, we shouldn't come up with one segmentation. We should come up with a lot of different segmentations that kind of represent what humans would also segment."

Populating the field

To generate its set of candidate segmentations, Chang and Fisher's algorithm strikes different balances between two measures of segmentation quality. One measure is the difference between the parts of the image on opposite sides of each boundary. The most obvious way to gauge difference is by color value: A segmentation with blue pixels on one side of the boundary and red pixels on the other would be better than one that featured slightly different proportions of the same 30 shades of blue on both sides. But researchers have devised other, more subtle measures of difference, and Chang and Fisher's algorithm can use any of them.

The other measure of a segmentation's quality is its simplicity. If a computer is trying to segment a street scene that features a car, for instance, you probably don't want it to draw boundaries around every separate gleam of different-colored light on the car's hood. Simplicity and difference in appearance tend to be competing measures: It's easy to maximize color difference, for instance, if you draw a boundary around every pixel that's different from its neighbors, but such a segmentation would be ridiculously complex.

Chang and Fisher's algorithm assigns each segmentation a total score based on both simplicity and difference in appearance. But different segmentations could have roughly equivalent total scores: A segmentation that's a little too complex but has superb color difference, for instance, could have the same score as a segmentation that's pretty good on both measures. Chang and Fisher's algorithm is designed to find candidates with very high total scores. That ensures that none of the candidates will be outrageously bad, but it also makes the computation that much more complicated.

Other researchers have adopted the same general approach, but to generate their candidates, they adapted algorithms originally designed to find the one segmentation with the highest total score. Chang and Fisher realized that, because they were considering so many suboptimal possibilities anyway, they could use a less precise algorithm that runs much more efficiently. Although it's not essential to their approach that they find the highest-scoring segmentation, it's still likely that the many candidates they produce will include a few that are very close to it.

"There are a lot of competing methodologies out there, so it's hard for me to say that this is going to revolutionize segmentation," says Anthony Yezzi, a professor of electrical and computer engineering at the Georgia Institute of Technology. But, Yezzi says, the way in which Chang and Fisher's algorithm represents images is "an interesting new vehicle that I think could get a lot of mileage, even beyond ." The same technique, Yezzi says, could be applied to problems of object tracking — whether it's the motion of an object in successive frames of video or changes in a tumor's size over time — and pattern matching, where the idea is to recognize the similarity of objects depicted from slightly different angles or under different lighting conditions.

This story is republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site that covers news about MIT research, innovation and teaching.

Explore further: Seeing things: Researchers teach computers to recognize objects

## Related Stories

#### Seeing things: Researchers teach computers to recognize objects

October 13, 2009

(PhysOrg.com) -- If computers could recognize objects, they could automatically search through hours of video footage for a particular two-minute scene. A tourist strolling down a street in a strange city could take a cell-phone ...

#### Researchers Study Gene Regulation In Insects

April 27, 2006

Susan Brown, an associate professor of biology at Kansas State University, is interested in how evolution generates so much diversity in insects shapes and forms.

#### Automated MRI technique assists in earlier Alzheimer's diagnosis

June 24, 2008

An automated system for measuring brain tissue with magnetic resonance imaging (MRI) can help physicians more accurately diagnose Alzheimer's disease at an earlier stage according to a new study published in the July issue ...

#### Seeing the forest for the trees: Object recognition system breaks images into ever smaller parts

May 7, 2010

Object recognition is one of the core topics in computer vision research: After all, a computer that can see isn't much use if it has no idea what it's looking at. Researchers at MIT, working with colleagues at the University ...

#### What makes an image memorable? We tend to remember pictures of people much better than wide open spaces

May 24, 2011

Next time you go on vacation, you may want to think twice before shooting hundreds of photos of that scenic mountain or lake.

#### Segmentation is the secret behind the extraordinary diversification of animals

July 16, 2010

Segmentation, the repetition of identical anatomical units, seems to be the secret behind the diversity and longevity of the largest and most common animal groups on Earth. Researchers from CNRS and Universite Paris Diderot ...

## Recommended for you

#### For the first time, brain surface stimulation provides 'touch' feedback to direct movement

October 26, 2016

In the quest to restore movement to people with spinal cord injuries, researchers have focused on getting brain signals to disconnected nerves and muscles that no longer receive messages that would spur them to move.

#### Microsoft aims at Apple with high-end PCs, 3D software

October 26, 2016

Microsoft launched a new consumer offensive Wednesday, unveiling a high-end computer that challenges the Apple iMac along with an updated Windows operating system that showcases three-dimensional content and "mixed reality."

#### You are less anonymous on the web than you think—much less

October 26, 2016

If you still think you can be anonymous on the internet, a team of Stanford and Princeton researchers has news for you: You can't. Over the summer, the team launched what they called the Footprints Project, which invited ...

#### Making it easier to collaborate on code

October 26, 2016

Git is an open-source system with a polarizing reputation among programmers. It's a powerful tool to help developers track changes to code, but many view it as prohibitively difficult to use.

#### IEA hikes green energy forecast after 'turning point' year

October 25, 2016

Government support and lower costs will power stronger-than-expected global growth in renewable energy over the next five years, the International Energy Agency (IEA) said Tuesday.

#### Dutch unveil giant vacuum to clean outside air

October 25, 2016

Dutch inventors Tuesday unveiled what they called the world's first giant outside air vacuum cleaner—a large purifying system intended to filter out toxic tiny particles from the atmosphere surrounding the machine.

##### antialias
4.7 / 5 (3) May 31, 2011
The fusion of different algorithms is always an interesting concept (I did my own dissertation (mainly) on a novel segmentation algorithm for hard to detect boundaries. In itself a fusion of one algortithm from image segmentation and one from biomechanic modeling)

One of the quality criteria for a new segmentation approach is always how hard it is to parametrize the algorithm for best results. The above looks like it could do away with some parameters from the individual algorithms at the cost of introducing one new parameter for each (the weight the algorithm gets for the final score).

However I doubt that they get an efficiency of 50000 times (by what standard?). Maybe in extrem cases that make maximum use of the novel approach - but certainly not as a general measure.
##### JRDarby
not rated yet May 31, 2011
O nostalgia. I remember working with Canny exclusively back when I was creating software for use in automated vehicles.
##### Lordjavathe3rd
4.3 / 5 (3) May 31, 2011
Yeah, humans do it better, but we only look at what we understand is an object. Why not make the software impress upon the image what an object is most like, that way if there is an abnormality like a bird sitting on a branch, the software will separate the two. Humans do it like that, so why compare with out using the technique that people do? It should be a breeze.
##### antialias
4.3 / 5 (3) May 31, 2011
The 'unbreeze' part is to tell the computer what a bird is and what a branch is. That would mean giving it a sense of meaning to what is on the picture. I.e. you are trying to make your computer conscious of what it sees. That's not a breeze. That would be true artificial intelligence.
##### Royale
not rated yet May 31, 2011
I have to agree with antialias. Big O notation rates efficiency of algorithms. 50,000 seems a little extreme to me too. Just like simple sorting algorithms some of the most efficient ones are only efficient most of the time.
##### Lordjavathe3rd
2 / 5 (1) May 31, 2011
I'm far to uninformed to make an intelligent statement on how to go about computer vision. However I can't shake the feeling that it shouldn't be this difficult. We just need a language for describing objects and a way to expect what objects to find depending on the context... Well, at any rate I think this problem deserves way more funding, way more.
##### gwrede
1 / 5 (1) May 31, 2011
(A factor of 50,000 is only possible if the previous algorithm sucks.)

Unless the computer is familiar with the objects (or kinds of object) that may be present in the picture, object boundary recognition will be shaky at best. Actually, developing a reliable algorithm for finding the boundaries in arbitrary pictures, without resorting to databases of objects or a form gallery, should be provable to be impossible.

Hence, an object database is needed. And the more specific the assortment of pictures to analyze, the better and faster the software, mainly because of the smaller database.

Face recognition, spotting road signs, recognizing land usage type areas, the orientation of 3D objects on the product line, these all are examples that would be vastly simpler with a small database.

Optimization then becomes mostly charting the area between specific database objects and general notions about object form properties.
##### mattbroderick
1 / 5 (2) May 31, 2011
I know humans can do this very efficiently compared to machines, but we practice it every day and are aided by 3D data to help us improve! Give a machine 3D data and this becomes simpler in theory (but maybe harder computationally?). I don't see pure segmentation, with no additional resources like an object database or training, as a worthwhile goal.