Genius of Einstein, Fourier key to new humanlike computer vision
(PhysOrg.com) -- Two new techniques for computer-vision technology mimic how humans perceive three-dimensional shapes by instantly recognizing objects no matter how they are twisted or bent, an advance that could help machines see more like people.
The techniques, called heat mapping and heat distribution, apply mathematical methods to enable machines to perceive three-dimensional objects, said Karthik Ramani, Purdue University's Donald W. Feddersen Professor of Mechanical Engineering.
"Humans can easily perceive 3-D shapes, but it's not so easy for a computer," he said. "We can easily separate an object like a hand into its segments - the palm and five fingers - a difficult operation for computers."
Both of the techniques build on the basic physics and mathematical equations related to how heat diffuses over surfaces.
"Albert Einstein made contributions to diffusion, and 18th century physicist Jean Baptiste Joseph Fourier developed Fourier's law, used to derive the heat equation," Ramani said. "We are standing on the shoulders of giants in creating the algorithms for these new approaches using the heat equation."
As heat diffuses over a surface it follows and captures the precise contours of a shape. The system takes advantage of this "intelligence of heat," simulating heat flowing from one point to another and in the process characterizing the shape of an object, he said.
Findings will be detailed in two papers being presented during the IEEE Computer Vision and Pattern Recognition conference on June 21-23 in Colorado Springs. The paper was written by Ramani, Purdue doctoral students Yi Fang and Mengtian Sun, and Minhyong Kim, a professor of pure mathematics at University College London.
A major limitation of existing methods is that they require "prior information" about a shape in order for it to be analyzed.
"For example, in order to do segmentation you have to tell the computer ahead of time how many segments the object has," Ramani said. "You have to tell it that you are expecting, say, 10 segments or 12 segments."
The new methods mimic the human ability to properly perceive objects because they don't require a preconceived idea of how many segments exist.
"We are trying to come as close as possible to human segmentation," Ramani said. "A hot area right now is unsupervised machine learning. This means a machine, such as a robot, can perceive and learn without having any previous training. We are able to estimate the segmentation instead of giving a predefined number of segments."
The work is funded partially by the National Science Foundation. A patent on the technology is pending.
The methods have many potential applications, including a 3-D search engine to find mechanical parts such as automotive components in a database; robot vision and navigation; 3-D medical imaging; military drones; multimedia gaming; creating and manipulating animated characters in film production; helping 3-D cameras to understand human gestures for interactive games; and contributing to progress in areas of science and engineering related to pattern recognition, machine learning and computer vision.
The heat-mapping method works by first breaking an object into a mesh of triangles, the simplest shape that can characterize surfaces, and then calculating the flow of heat over the meshed object. The method does not involve actually tracking heat; it simulates the flow of heat using well-established mathematical principles, Ramani said.
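As a rough illustration of simulating heat on a triangle mesh, the Python sketch below builds a Laplacian matrix from triangle connectivity and evaluates the heat kernel exp(-tL). The toy mesh, the function names and the uniform edge weights are assumptions made for this example; the article does not say which discretization the Purdue researchers use.

```python
import numpy as np

def graph_laplacian(num_vertices, triangles):
    """Uniform-weight Laplacian built from triangle connectivity (an assumed discretization)."""
    adjacency = np.zeros((num_vertices, num_vertices))
    for a, b, c in triangles:
        for i, j in ((a, b), (b, c), (c, a)):
            adjacency[i, j] = adjacency[j, i] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def heat_kernel(laplacian, t):
    """Return exp(-t L) using the eigendecomposition of the symmetric Laplacian."""
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    # Scale each eigenvector column by exp(-t * eigenvalue), then project back.
    return (eigenvectors * np.exp(-t * eigenvalues)) @ eigenvectors.T

# Hypothetical toy mesh: a strip of four triangles over six vertices.
triangles = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
laplacian = graph_laplacian(6, triangles)
kernel = heat_kernel(laplacian, t=0.5)

# Unit heat placed at vertex 0: column 0 holds the temperature of every vertex at time t.
temperature = kernel[:, 0]
```

In this sketch the total heat is conserved, and vertices closer to the source along the mesh stay warmer than distant ones, which is how the simulated flow ends up encoding the shape.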
Heat mapping allows a computer to recognize an object, such as a hand or a nose, no matter how the fingers are bent or the nose is deformed, and to ignore "noise" introduced by imperfect laser scanning or other erroneous data.
"No matter how you move the fingers or deform the palm, a person can still see that it's a hand," Ramani said. "But for a computer to say it's still a hand is going to be hard. You need a framework - a consistent, robust algorithm that will work no matter if you perturb the nose and put noise in it or if it's your nose or mine."
The method accurately simulates how heat flows on the object while revealing its structure and distinguishing unique points needed for segmentation by computing the "heat mean signature." Knowing the heat mean signature allows a computer to determine the center of each segment, assign a "weight" to specific segments and then define the overall shape of the object.
"Being able to assign a weight to segments is critical because certain points are more important than others in terms of understanding a shape," Ramani said. "The tip of the nose is more important than other points on the nose, for example, to properly perceive the shape of the nose or face, and the tips of the fingers are more important than many other points for perceiving a hand."
In the second technique, temperature distribution, heat flow is used to determine a signature, or histogram, of the entire object.
"A histogram is a two-dimensional mapping of a three-dimensional shape," Ramani said. "So, no matter how a dog bends or twists, it gives you the same signature."
The temperature distribution technique also uses a triangle mesh to perceive 3-D shapes. Both techniques, which could be combined in the same system, require modest computer power and recognize shapes quickly, he said.
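One simplified reading of such a signature, offered as an assumption rather than the researchers' published definition, is this: place a unit of heat at each vertex in turn, record the average temperature reached by the other vertices after a fixed time, and bin those values into a histogram. The Python sketch below does that on a toy mesh given as an edge list; the function names, the diffusion time and the bin count are all hypothetical.

```python
import numpy as np

def heat_kernel(num_vertices, edges, t):
    """exp(-t L) for a uniform-weight Laplacian built from an edge list (assumed discretization)."""
    adjacency = np.zeros((num_vertices, num_vertices))
    for i, j in edges:
        adjacency[i, j] = adjacency[j, i] = 1.0
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return (eigenvectors * np.exp(-t * eigenvalues)) @ eigenvectors.T

def temperature_histogram(num_vertices, edges, t=1.0, bins=8):
    """Histogram of the average temperature each vertex delivers to all other vertices."""
    kernel = heat_kernel(num_vertices, edges, t)
    # Row sum minus the diagonal is the heat that left the source vertex.
    average_temperature = (kernel.sum(axis=1) - np.diag(kernel)) / (num_vertices - 1)
    upper = 1.0 / (num_vertices - 1)  # largest possible average when all heat has left
    counts, _ = np.histogram(average_temperature, bins=bins, range=(0.0, upper))
    return counts / counts.sum()

# Hypothetical toy mesh: edges of a strip of four triangles over six vertices.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)]
signature = temperature_histogram(6, edges)

# Renumbering the vertices describes the same shape and should give the same signature.
relabel = [3, 0, 5, 1, 4, 2]
relabeled_edges = [(relabel[i], relabel[j]) for i, j in edges]
relabeled_signature = temperature_histogram(6, relabeled_edges)
same_signature = np.allclose(signature, relabeled_signature)
```

Because this simplified version depends only on how the vertices are connected, moving the vertices around without changing the connectivity leaves the histogram unchanged. A real system would presumably need a Laplacian that also accounts for edge lengths and triangle areas.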
"It's very efficient and very compact because you're just using a two-dimensional histogram," Ramani said. "Heat propagation in a mesh happens very fast because the mathematics of matrix computations can be done very quickly and well."
The researchers tested their method on certain complex shapes, including hands, the human form and a centaur, a mythical half-human, half-horse creature.
3-D mesh segmentation is a fundamental low-level task with applications in areas as diverse as computer vision, computer-aided design, bio-informatics, and 3-D medical imaging. A perceptually consistent mesh segmentation (PCMS), as defined in this paper, is one that satisfies 1) invariance to isometric transformation of the underlying surface, 2) robustness to perturbations of the surface, 3) robustness to numerical noise on the surface, and 4) close conformation to human perception. We exploit the intelligence of heat as a global structure-aware message on a meshed surface and develop a robust PCMS scheme, called Heat-Mapping, based on the heat kernel. There are three main steps in Heat-Mapping. First, the number of segments is estimated based on an analysis of the behavior of the Laplacian spectrum. Second, the heat center, which is defined as the most representative vertex on each segment, is discovered by a proposed heat center hunting algorithm. Third, a heat center driven segmentation scheme reveals the PCMS with high consistency towards human perception. Extensive experimental results on various types of models verify the performance of Heat-Mapping with respect to the consistent segmentation of articulated bodies, topological changes, and various levels of numerical noise.
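The abstract does not spell out how the Laplacian spectrum is analyzed in the first step. A standard technique from spectral clustering, the eigengap heuristic, conveys the general idea and is used here as a stand-in, not as the paper's method: count the small eigenvalues that precede the largest jump in the sorted spectrum. The toy graph, the function names and the `max_segments` cap in this Python sketch are assumptions.

```python
import numpy as np

def estimate_segment_count(adjacency, max_segments=6):
    """Eigengap heuristic: number of eigenvalues before the largest jump in the spectrum."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigenvalues = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    gaps = np.diff(eigenvalues[: max_segments + 1])
    return int(np.argmax(gaps)) + 1

def three_blob_graph():
    """Hypothetical stand-in for a mesh: three 4-vertex cliques chained by single edges."""
    adjacency = np.zeros((12, 12))
    for start in (0, 4, 8):
        block = slice(start, start + 4)
        adjacency[block, block] = 1.0
    np.fill_diagonal(adjacency, 0.0)  # no self-loops
    for i, j in ((3, 4), (7, 8)):     # thin bridges between the blobs
        adjacency[i, j] = adjacency[j, i] = 1.0
    return adjacency

segments = estimate_segment_count(three_blob_graph())  # 3 for this toy graph
```

No segment count is passed in: the estimate comes from the spectrum alone, which is the property Ramani highlights when contrasting the approach with methods that need the number of segments ahead of time.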