Improving machine learning with an old approach

December 22, 2015 by Grayson York
Rong Ge is an assistant professor of computer science.

Computer scientist Rong Ge has an interesting approach to machine learning. While most machine learning specialists will build an algorithm which molds to a specific dataset, Ge builds an algorithm which he can guarantee will perform well across many datasets.

A paper he wrote as a postdoc at Microsoft Research, Escaping From Saddle Points—Online Stochastic Gradient for Tensor Decomposition, describes how a programmer can use the imprecision of a common algorithm, known as stochastic gradient descent, to his advantage.

Normally, stochastic gradient descent is used to speed up a slow learning process by approximating the correct solution rather than working harder to compute it precisely. Ge and his colleagues found, however, that the small amount of "noise" the approximation introduces can be the saving grace of an algorithm that would otherwise be trapped by its own perfectionism.

"This algorithm is not normally used for this purpose," Ge says, "It is normally used as a heuristic to approximate the solution to a problem."

Noise allows the algorithm to escape from something called a "saddle point" on the function that stochastic gradient descent is trying to optimize, a region that looks sort of like a sine wave. Ge describes gradient descent as being like a ball rolling down a hill: on a slope it always seeks the lowest point, but at a saddle point, a high point on a "ridge" between two different slopes, it will not start rolling.

Stochastic gradient descent remedies this problem by jostling the ball enough to start it rolling down one of the slopes. But one cannot be sure which way it will roll.
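
To make the picture concrete, here is a minimal sketch of that idea (not the algorithm from Ge's paper) on a toy function whose origin is a saddle point: plain gradient descent started exactly on the ridge never moves, while a small random jostle added to each step lets the point roll away. The function, step size, and noise level are illustrative assumptions only.

    import random

    def grad(x, y):
        # gradient of the toy function f(x, y) = x^2 - y^2, whose origin is a saddle point
        return 2 * x, -2 * y

    def descend(x, y, steps=200, lr=0.1, noise=0.0):
        # gradient descent with an optional random jostle added to every step
        for _ in range(steps):
            gx, gy = grad(x, y)
            x -= lr * (gx + noise * random.gauss(0, 1))
            y -= lr * (gy + noise * random.gauss(0, 1))
        return x, y

    # Starting exactly on the ridge, the gradient is zero and the plain update never moves.
    print(descend(0.0, 0.0, noise=0.0))   # stays at (0.0, 0.0)
    # A small random jostle pushes the point off the ridge; |y| then grows and it escapes.
    print(descend(0.0, 0.0, noise=0.01))  # ends far from the saddle point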

The results he has obtained relate to a branch of machine learning known as unsupervised learning. One common problem in unsupervised learning is "clustering," in which the algorithm attempts to find clusters of points that are similar to each other yet different from the other points in the set. The algorithm then labels each of the clusters it has found and returns its solution to the programmer.
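
For readers unfamiliar with the task, a toy illustration of clustering might look like the following Lloyd-style k-means loop. This is not Ge's tensor-decomposition method; the data, the one-dimensional setting, and the choice of two clusters are assumptions made purely for illustration.

    def kmeans_1d(points, k=2, iters=20):
        centers = list(points[:k])                # naive initialization: first k points
        for _ in range(iters):
            # assignment step: put each point with its nearest center
            groups = [[] for _ in range(k)]
            for p in points:
                j = min(range(k), key=lambda c: abs(p - centers[c]))
                groups[j].append(p)
            # update step: move each center to the mean of its group
            centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        # label every point with the index of its final cluster
        return [min(range(k), key=lambda c: abs(p - centers[c])) for p in points]

    data = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
    print(kmeans_1d(data))   # e.g. [0, 0, 0, 1, 1, 1]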

A key requirement for the final result of the algorithm to be correct is that the two slopes end at low points of equal depth, which represent optimal solutions. Often two solutions will appear different at first glance but actually represent the same set of clusters, differing only because the labels on the clusters were swapped. Ge says one of the hardest parts of his process was designing functions that have this property.
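
As a small illustration of that symmetry, the hypothetical helper below checks whether two labelings describe the same clusters once the cluster names are allowed to be permuted; the function name and the example labelings are assumptions for illustration, not code from the paper.

    from itertools import permutations

    def same_clustering(labels_a, labels_b):
        # true if some renaming of b's cluster ids reproduces a exactly
        ids = sorted(set(labels_b))
        for perm in permutations(ids):
            mapping = dict(zip(ids, perm))
            if [mapping[b] for b in labels_b] == list(labels_a):
                return True
        return False

    print(same_clustering([0, 0, 1, 1], [1, 1, 0, 0]))  # True: only the labels were swapped
    print(same_clustering([0, 0, 1, 1], [0, 1, 0, 1]))  # False: a genuinely different grouping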

These results are guaranteed to hold so long as the dataset is not provided by someone who has specifically engineered it to break the algorithm. If someone has designed such a dataset, the problem becomes "NP-hard," meaning that there is little hope for even the best algorithms to solve it.

Hopefully we will see more growth in this field, especially interesting results such as this one, which show that the weaknesses associated with an algorithm can actually be strengths under different circumstances.


More information: Escaping From Saddle Points — Online Stochastic Gradient for Tensor Decomposition.
