A visual saliency technique that can detect and extract relevant information from both still and moving images has many applications for computer image processing. Such a technique can be used to detect motion, distinguish different objects and improve the quality of specific parts of an image through selective compression.
Shijian Lu and colleagues from the A*STAR Institute for Infocomm Research in Singapore have developed a robust and efficient method for capturing such salient information from images and movies. They found that the key lies in the distribution of brightness and color between pairs of pixels.
Digital images are encoded as pixels, or points in an image. To detect an object (for example, a person standing in the foreground), brightness variations between neighboring pixels could be compared. However, considering just individual pixels can be deceiving, as the context is important when seeking to distinguish between important details and unimportant background information.
The technique developed by the A*STAR researchers therefore involves counting the pixels in an image based on their color and then plotting the distribution. This reveals not only the distribution of colors, but also the frequency at which particular pairs of neighboring pixels appear. A low frequency of pixel pairs with a certain difference in color indicates a region of high interest, as it denotes clear boundaries between objects. In this way, salient features can be readily identified: not only large areas of contrast in a photograph, such as a yellow school bus in front of a neutral background, but also contrasts in smaller areas, such as a person wearing a safety vest (see image).
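The idea of scoring rare neighboring-pixel pairs as salient can be illustrated with a minimal sketch. This is not the researchers' published algorithm: it assumes a single brightness channel with values in the range 0 to 255 instead of full color, considers only horizontally and vertically adjacent pixels, and uses the negative log of a pair's frequency as its rarity score. The function names `cooccurrence_histogram` and `saliency_map` and the parameter `n_levels` are hypothetical, chosen for illustration.

```python
import numpy as np


def cooccurrence_histogram(levels, n_levels):
    """Return the normalized frequency of each pair of quantized values
    found in horizontally or vertically adjacent pixels."""
    hist = np.zeros((n_levels, n_levels), dtype=np.float64)
    # Horizontal neighbors: each pixel paired with the one to its right.
    np.add.at(hist, (levels[:, :-1].ravel(), levels[:, 1:].ravel()), 1)
    # Vertical neighbors: each pixel paired with the one below it.
    np.add.at(hist, (levels[:-1, :].ravel(), levels[1:, :].ravel()), 1)
    # Treat the pair (a, b) the same as (b, a).
    hist = hist + hist.T
    return hist / hist.sum()


def saliency_map(gray, n_levels=8):
    """Score each pixel by how rare its neighbor pairs are (assumed sketch).

    `gray` is a 2-D array of brightness values assumed to lie in [0, 255].
    The result is scaled to [0, 1]; higher means more salient.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # Quantize brightness into a small number of bins.
    levels = np.clip((gray / 256.0 * n_levels).astype(np.intp), 0, n_levels - 1)
    prob = cooccurrence_histogram(levels, n_levels)
    # Rare pairs get a high score; the small constant avoids log(0).
    rarity = -np.log(prob + 1e-12)

    total = np.zeros(gray.shape)
    count = np.zeros(gray.shape)
    # Credit each horizontal pair's rarity to both of its pixels.
    r = rarity[levels[:, :-1], levels[:, 1:]]
    total[:, :-1] += r
    total[:, 1:] += r
    count[:, :-1] += 1
    count[:, 1:] += 1
    # Credit each vertical pair's rarity to both of its pixels.
    r = rarity[levels[:-1, :], levels[1:, :]]
    total[:-1, :] += r
    total[1:, :] += r
    count[:-1, :] += 1
    count[1:, :] += 1

    sal = total / np.maximum(count, 1)
    sal -= sal.min()
    if sal.max() > 0:
        sal /= sal.max()
    return sal


if __name__ == "__main__":
    # Synthetic test image: a small bright patch on a uniform dark background.
    image = np.full((20, 20), 50.0)
    image[8:12, 8:12] = 200.0
    print(np.round(saliency_map(image)[6:14, 6:14], 2))
```

In this toy image, background-to-background pairs dominate the histogram and so score low, while the few pairs that straddle the edge of the bright patch are rare and score high. The sketch averages rarity over a pixel's immediate neighbors only; a practical model would likely work with color and a wider neighborhood.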
"Our model has great potential for predicting the point in an image that will attract the human eye," comments Lu. "Apart from generic object detection, it can be applied to tasks such as guiding robots or to the smart design of web pages and advertisements."
The next step for the researchers will be to apply this scheme to detecting motion in videos, which follows rules similar to those for identifying relevant information in still photographs. Moreover, Lu says that their algorithm enables more complex approaches to image analysis.
"An example is the development of computational modeling of the top-down approach of humans looking at a scene," says Lu. "Combining our bottom-up modeling algorithm with a top-down visual search could solve many challenging computer vision problems—such as anomaly detection or target search—in a more robust, efficient and cognitive manner."
More information: Lu, S., Tan, C. & Lim, J.-H. "Robust and efficient saliency modeling from image co-occurrence histograms." IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 195–201 (2014). dx.doi.org/10.1109/TPAMI.2013.158