Computer scientists at Carnegie Mellon University have devised an innovative and elegantly concise algorithm that can efficiently solve systems of linear equations that are critical to such important computer applications as image processing, logistics and scheduling problems, and recommendation systems.
The theoretical breakthrough by Professor Gary Miller, Systems Scientist Ioannis Koutis and Ph.D. student Richard Peng, all of Carnegie Mellon's Computer Science Department, has enormous practical potential. Linear systems are widely used to model real-world systems, such as transportation, energy, telecommunications and manufacturing that often may include millions, if not billions, of equations and variables.
Solving these linear systems can be time consuming on even the fastest computers and is an enduring computational problem that mathematicians have sweated for 2,000 years. The Carnegie Mellon team's new algorithm employs powerful new tools from graph theory, randomized algorithms and linear algebra that make stunning increases in speed possible.
The algorithm, which applies to an important class of problems known as symmetric diagonally dominant (SDD) systems, is so efficient that it may soon be possible for a desktop workstation to solve systems with a billion variables in just a few seconds.
The work will be presented at the annual IEEE Symposium on Foundations of Computer Science (FOCS 2010), Oct. 23-36 in Las Vegas.
A myriad of new applications have emerged in recent years for SDD systems. Recommendation systems, such as the one used by Netflix to suggest movies to customers, use SDD systems to compare the preferences of an individual to those of millions of other customers. In image processing, SDD systems are used to segment images into component pieces, such as earth, sky and objects like buildings, trees and people. "Denoising" images to bring out lettering and other details that otherwise might appear as a blur also make use of SDD systems.
A large class of logistics, scheduling and optimization problems can be formulated as maximum-flow problems, or "max flow," which calculate the maximum amount of materials, data packets or vehicles that can move through a network, be it a supply chain, a telecommunications network or a highway system. The current theoretically best max flow algorithm uses, at its core, an SDD solver.
SDD systems also are widely used in engineering, such as for computing heat flow in materials or the vibrational modes of objects with complex shapes, in machine learning, and in computer graphics and simulations.
"In our work at Microsoft on digital imaging, we use a variety of fast techniques for solving problems such as denoising, image blending and segmentation," said Richard Szeliski, leader of the Interactive Visual Media Group at Microsoft Research. "The fast SDD solvers developed by Koutis, Miller and Peng represent a real breakthrough in this domain, and I expect them to have a major impact on the work that we do."
Finding methods to quickly and accurately solve simultaneous equations is an age-old mathematical problem. A classic algorithm for solving linear systems, dubbed Gaussian elimination in modern times, was first published by Chinese mathematicians 2,000 years ago.
"The fact that you can couch the world in linear algebra is super powerful," Miller said. "But once you do that, you have to solve these linear systems and that's often not easy."
A number of SDD solvers have been developed, but they tend not to work across the broad class of SDD problems and are prone to failures. The randomized algorithm developed by Miller, Koutis and Peng, however, applies across the spectrum of SDD systems.
The team's approach to solving SDD systems is to first solve a simplified system that can be done rapidly and serve as a "preconditioner" to guide iterative steps to an ultimate solution. To construct the preconditioner, the team uses new ideas from spectral graph theory, such as spanning tree and random sampling.
The result is a significant decrease in computer run times. The Gaussian elimination algorithm runs in time proportional to s3, where s is the size of the SDD system as measured by the number of terms in the system, even when s is not much bigger the number of variables. The new algorithm, by comparison, has a run time of s[log(s)]2. That means, if s = 1 million, that the new algorithm run time would be about a billion times faster than Gaussian elimination.
Other algorithms are better than Gaussian elimination, such as one developed in 2006 by Daniel Spielman of Yale University and Miller's former student, Shang-Hua Teng of the University of Southern California, which runs in s[log(s)]25. But none promise the same speed as the one developed by the Carnegie Mellon team.
"The new linear system solver of Koutis, Miller and Peng is wonderful both for its speed and its simplicity," said Spielman, a professor of applied mathematics and computer science at Yale. "There is no other algorithm that runs at even close to this speed. In fact, it's impossible to design an algorithm that will be too much faster."
Explore further: Programming smart molecules: Machine-learning algorithms could make chemical reactions intelligent
More information: The group's research paper, "Approaching Optimality for Solving SDD Linear Systems," can be downloaded at www.cs.cmu.edu/~glmiller/Publications/Papers/KoutisApproaching-2010.pdf