3-D mapping in real time, without the drift

August 28, 2013 by Jennifer Chu

Computer scientists at MIT and the National University of Ireland (NUI) at Maynooth have developed a mapping algorithm that creates dense, highly detailed 3-D maps of indoor and outdoor environments in real time.

The researchers tested their algorithm on videos taken with a low-cost Kinect camera, including one that explores the serpentine halls and stairways of MIT's Stata Center. Applying their algorithm to these videos, the researchers created rich, three-dimensional maps as the camera explored its surroundings.

When the camera circled back to its starting point, the algorithm recognized the location as familiar and quickly stitched images together to effectively "close the loop," creating a continuous, realistic 3-D map in real time.

The technique solves a major problem in the robotic mapping community that's known as either "loop closure" or "drift": As a camera pans across a room or travels down a corridor, it invariably introduces slight errors in the estimated path taken. A doorway may shift a bit to the right, or a wall may appear slightly taller than it is. Over relatively long distances, these errors can compound, resulting in a disjointed map, with walls and stairways that don't exactly line up.
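
To see how such small errors compound, consider a minimal sketch that is not taken from the researchers' system. It walks a simulated camera around a closed loop in two dimensions and adds a tiny heading error at every step. The function names and the `heading_bias` parameter are hypothetical and exist only for illustration.

```python
import math

def integrate_path(steps, step_len, turn_per_step, heading_bias=0.0):
    """Dead-reckon a 2-D path from per-step motion estimates.

    heading_bias is a hypothetical constant error (radians) added to
    every estimated turn, standing in for small tracking errors.
    """
    x, y, heading = 0.0, 0.0, 0.0
    for _ in range(steps):
        heading += turn_per_step + heading_bias
        x += step_len * math.cos(heading)
        y += step_len * math.sin(heading)
    return x, y

def loop_gap(steps, heading_bias):
    """Distance between the estimated end point and the true start (the origin)
    after one full lap around a regular polygon with unit-length sides."""
    turn = 2 * math.pi / steps
    x, y = integrate_path(steps, 1.0, turn, heading_bias)
    return math.hypot(x, y)

if __name__ == "__main__":
    for bias in (0.0, 0.001, 0.002):
        print(f"bias={bias:.3f} rad/step -> gap={loop_gap(100, bias):.3f}")
```

With no bias the estimated path ends where it began. With even a small per-step bias the estimated end point lands away from the start, and the gap widens as the bias grows. That mismatch is the kind of error a loop closure has to remove.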

In contrast, the new mapping technique determines how to connect a map by tracking a camera's pose, or position in space, throughout its route. When a camera returns to a place where it's already been, the algorithm determines which points within the 3-D map to adjust, based on the camera's previous poses.

"Before the map has been corrected, it's sort of all tangled up in itself," says Thomas Whelan, a PhD student at NUI. "We use knowledge of where the camera's been to untangle it. The technique we developed allows you to shift the map, so it warps and bends into place."

The technique, he says, may be used to guide robots through potentially hazardous or unknown environments. Whelan's colleague John Leonard, a professor of mechanical engineering at MIT, also envisions a more benign application.

A visualization of the mapping process, producing dense maps at sub-centimeter resolution. Credit: THOMAS WHELAN AND JOHN MCDONALD/NUI-MAYNOOTH; MICHAEL KAESS AND JOHN J. LEONARD/MIT

"I have this dream of making a complete model of all of MIT," says Leonard, who is also affiliated with MIT's Computer Science and Artificial Intelligence Laboratory. "With this 3-D map, a potential applicant for the freshman class could sort of 'swim' through MIT like it's a big aquarium. There's still more work to do, but I think it's doable."

Leonard, Whelan and the other members of the team—Michael Kaess of MIT and John McDonald of NUI—will present their work at the 2013 International Conference on Intelligent Robots and Systems in Tokyo.

The problem with a million points

The Kinect camera produces a color image, along with information on the spacing of every pixel in that image. A depth sensor in the camera translates the pixel spacing into a measurement of depth, recording the depth of every single pixel in an image. This data can be parsed by an application to generate a 3-D representation of the image.
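
As a rough illustration of that last step, the sketch below turns per-pixel depth readings into 3-D points using the standard pinhole camera model. This is an assumption about how such an application might work, not a description of the Kinect's own software. The intrinsic values (`fx`, `fy`, `cx`, `cy`) are illustrative placeholders rather than calibrated numbers, and the function names are hypothetical.

```python
def backproject(u, v, depth_m, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Convert pixel (u, v) with a depth in meters into a 3-D point
    in the camera frame, using a pinhole model.

    fx, fy (focal lengths in pixels) and cx, cy (image center) are
    placeholder intrinsics assumed for this example.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

def depth_image_to_points(depth_rows):
    """Back-project every pixel with a valid (positive) depth reading.

    depth_rows is a list of rows, each a list of depths in meters;
    a value of 0 is treated here as 'no reading'.
    """
    points = []
    for v, row in enumerate(depth_rows):
        for u, depth_m in enumerate(row):
            if depth_m > 0:
                points.append(backproject(u, v, depth_m))
    return points

if __name__ == "__main__":
    tiny_depth_image = [[0.0, 1.0], [2.0, 0.0]]
    for point in depth_image_to_points(tiny_depth_image):
        print(point)
```

A pixel at the assumed image center maps to a point straight ahead of the camera, while pixels farther from the center map to points farther off-axis in proportion to their depth.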

In 2011, a group from Imperial College London and Microsoft Research developed a 3-D mapping application called KinectFusion, which successfully produced 3-D models from Kinect data in real time. The technique generated very detailed models, at sub-centimeter resolution, but was restricted to a fixed region in space.

Whelan, Leonard and their team expanded on that group's work to develop a technique to create equally high-resolution 3-D maps, over hundreds of meters, in various environments and in real time. The goal, they note, was ambitious from a data perspective: An environment spanning hundreds of meters would consist of millions of 3-D points. To generate an accurate map, one would have to know which points among the millions to align. Previous groups have tackled this problem by running the data over and over, an impractical approach if you want to create maps in real time.

Mapping by slicing

Instead, Whelan and his colleagues came up with a much faster approach, which they describe in two stages: a front end and a back end.

In the front end, the researchers developed an algorithm to track a camera's position at any given moment along its route. As the Kinect camera takes images at 30 frames per second, the algorithm measures how much and in what direction the camera has moved between consecutive frames. At the same time, the algorithm builds up a 3-D model, consisting of small "cloud slices": cross-sections of thousands of 3-D points in the immediate environment. Each cloud slice is linked to a particular camera pose.

As a camera moves down a corridor, cloud slices are integrated into a global 3-D map representing the larger, bird's-eye perspective of the route thus far.
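
The bookkeeping in the front end can be sketched in a few lines. The example below is a simplified two-dimensional illustration, not the researchers' code: it chains frame-to-frame motion estimates into camera poses, ties each slice of points to the pose it was captured from, and places the slices into one global map. The names `CloudSlice`, `compose`, `run_front_end` and `build_global_map` are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class CloudSlice:
    pose_index: int     # index of the camera pose this slice is linked to
    local_points: list  # (x, y) points expressed in the camera's own frame

def compose(pose, delta):
    """Apply a frame-to-frame motion (dx, dy, dtheta), measured in the
    camera frame, to a pose (x, y, theta) in the world frame."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

def to_world(pose, point):
    """Move a point from the camera frame into the world frame."""
    x, y, th = pose
    px, py = point
    return (x + px * math.cos(th) - py * math.sin(th),
            y + px * math.sin(th) + py * math.cos(th))

def run_front_end(frame_motions, frame_points):
    """Build one pose and one cloud slice per frame.

    frame_points has one entry per frame; frame_motions has one entry
    per pair of consecutive frames, so it is one element shorter.
    """
    poses = [(0.0, 0.0, 0.0)]
    slices = [CloudSlice(0, frame_points[0])]
    for i, delta in enumerate(frame_motions, start=1):
        poses.append(compose(poses[-1], delta))
        slices.append(CloudSlice(i, frame_points[i]))
    return poses, slices

def build_global_map(poses, slices):
    """Place every slice's points into the world frame using its linked pose."""
    return [to_world(poses[s.pose_index], p)
            for s in slices for p in s.local_points]

if __name__ == "__main__":
    motions = [(1.0, 0.0, 0.0), (1.0, 0.0, math.pi / 2)]
    points = [[(1.0, 0.0)], [(1.0, 0.0)], [(1.0, 0.0)]]
    poses, slices = run_front_end(motions, points)
    print(build_global_map(poses, slices))
```

Because each slice stores only a reference to its pose, the global map can be regenerated from the poses at any time, which is what makes later corrections cheap.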

In the back end, the technique takes all the poses that have been tracked and lines them up in places that look familiar. The technique automatically adjusts the associated cloud slices, along with their thousands of points—a fast approach that avoids having to determine, point by point, which to move.
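
The sketch below illustrates why moving poses is cheaper than moving points. It is a deliberately simplistic stand-in, not the team's optimization: it spreads a known loop-closure error evenly along the trajectory (an assumption made for brevity), then rebuilds the map by re-placing each slice with its corrected pose. All names are hypothetical.

```python
import math

def to_world(pose, point):
    """Move a point from the camera frame into the world frame."""
    x, y, th = pose
    px, py = point
    return (x + px * math.cos(th) - py * math.sin(th),
            y + px * math.sin(th) + py * math.cos(th))

def distribute_correction(poses, loop_error):
    """Remove a loop-closure error (ex, ey, etheta) from a trajectory.

    Assumption for this sketch: the error built up evenly, so pose i
    out of n receives the fraction i/n of the full correction.
    """
    ex, ey, eth = loop_error
    n = len(poses) - 1
    corrected = []
    for i, (x, y, th) in enumerate(poses):
        w = i / n if n else 0.0
        corrected.append((x - w * ex, y - w * ey, th - w * eth))
    return corrected

def rebuild_map(poses, slices):
    """Re-place every slice using its pose; no individual point is matched.

    slices is a list of (pose_index, local_points) pairs.
    """
    return [to_world(poses[i], p) for i, pts in slices for p in pts]

if __name__ == "__main__":
    drifted = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.2, 0.0)]
    slices = [(0, [(0.0, 1.0)]), (2, [(0.0, 1.0)])]
    fixed = distribute_correction(drifted, (0.0, 0.2, 0.0))
    print(rebuild_map(drifted, slices))
    print(rebuild_map(fixed, slices))
```

Only three poses are adjusted here, yet every point in every slice moves with them. The same holds when each slice carries thousands of points.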

The team has used its technique to create 3-D maps of MIT's Stata Center, along with indoor and outdoor locations in London, Sydney, Germany and Ireland. In the future, the group envisions that the technique may be used to give robots much richer information about their surroundings. For example, a 3-D map would not only help a robot decide whether to turn left or right, but also present more detailed information.

"You can imagine a robot could look at one of these maps and say there's a bin over here, or a fire extinguisher over here, and make more intelligent interpretations of the environment," Whelan says. "It's just a pick-up-and-go system, and we feel there's a lot of potential for this kind of technique."
