New program picks out targets in a crowd quickly and efficiently

February 22, 2019, Agency for Science, Technology and Research (A*STAR), Singapore

It can be harder for computers than for humans to find Waldo, the elusive character who hides in crowds in a popular children's book series.

Now, an A*STAR researcher and her colleagues have developed a biologically-inspired program that could enable computers to identify real-life Waldos and other targets more efficiently.

Computer image analysis is routinely used in medicine, security, and rescue. Speed is often critical in these efforts, says Mengmi Zhang, a computer scientist at A*STAR's Institute for Infocomm Research, who led the study. She cites the use of computers to help find victims of natural disasters, such as earthquakes.

But these efforts are often hampered because computers lack human intuition. A person can quickly spot a dog in a crowded space, for instance, even if they have never seen that particular dog before. A computer, by contrast, needs to be trained on thousands of images of different dogs, and even then it can falter when looking for a new dog whose image it has not encountered previously.

This weakness could be particularly problematic when scanning for weapons, says Zhang. A computer trained to look for knives and guns might overlook another sharp object. "If there is one sharp metal stick which has not been seen in the training set, it doesn't mean the passenger should be able to take it on board the airplane," says Zhang.

Current computer searches also tend to be slow because the computer must scan every part of an image in sequence, paying equal attention to each part. Humans, however, rapidly shift their attention between several different locations in an image to find their target. Zhang and her colleagues wanted to understand how humans do this so efficiently. They presented 45 people with crowded images and asked them to hunt for a target, say, a sheep. They monitored how the subjects' eyes darted around the scene, fixating briefly on different locations in the image. They found that, on average, people could locate the sheep in around 640 milliseconds. This corresponded to switching the location of their gaze, on average, just over two and a half times.

The team then developed a computational model to implement this more human-like search strategy in the hunt for a dog. Rather than looking for a target that was identical to an image of a dog given beforehand, the model was trained to look for something that had similar features to the example image. This enabled the model to generalize from a single dog image to the "general concept of a dog," and quickly pick out other dogs it had not seen before, explains Zhang.
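The core idea of this kind of similarity-driven search can be illustrated with a minimal sketch. This is not the authors' published architecture; it assumes precomputed feature vectors (in the real model these would come from a deep network), and the function name, similarity threshold, and "inhibition of return" loop are illustrative choices. The model compares the target's features against every location in the scene, fixates on the best match, and, if that location is not the target, suppresses it and moves on, just as human gaze hops between candidate locations.

```python
import numpy as np

def search_fixations(target_feat, scene_feats, max_fixations=10, threshold=0.95):
    """Greedy human-like search: fixate on the location whose features best
    match the target; if it isn't a good enough match, suppress that spot
    ("inhibition of return") and shift gaze to the next-best candidate.

    target_feat: (d,) feature vector of the example target image.
    scene_feats: (H, W, d) feature map of the crowded scene.
    Returns the list of (row, col) fixations made until a match is found
    or the fixation budget runs out.
    """
    # Attention map = cosine similarity between target and every location.
    norms = np.linalg.norm(scene_feats, axis=-1) * np.linalg.norm(target_feat)
    attention = scene_feats @ target_feat / np.maximum(norms, 1e-8)

    fixations = []
    for _ in range(max_fixations):
        r, c = np.unravel_index(np.argmax(attention), attention.shape)
        fixations.append((int(r), int(c)))
        if attention[r, c] >= threshold:   # similar enough: target found
            break
        attention[r, c] = -np.inf          # inhibition of return
    return fixations
```

Because the match is by feature similarity rather than pixel identity, the same loop can locate a dog it has never seen, provided its features resemble the example's. Counting the length of the returned fixation list is also how one would measure search efficiency, mirroring the gaze-switch counts recorded for the human subjects.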

The researchers tested how effective the new computer visual search model was by measuring the number of times the computer had to fixate on different locations in a scene before finding its target. "What surprises us is that by using our method, computers can search images as fast as humans, even when searching for objects they've never seen before," says Zhang. The model was even as good as humans at finding Waldo.

The team is now programming their model with a better understanding of context. For example, humans naturally understand that a cup is more likely to be sitting on a table than floating in the air. Once implemented, this should improve the model's efficiency even further, says Zhang, adding, "Waldo cannot hide anymore."

More information: Mengmi Zhang et al. Finding any Waldo with zero-shot invariant and efficient visual search, Nature Communications (2018). DOI: 10.1038/s41467-018-06217-x

