New mathematical framework formalizes oddball programming techniques

May 23, 2012 by Larry Hardesty

Two years ago, Martin Rinard's group at MIT's Computer Science and Artificial Intelligence Laboratory proposed a surprisingly simple way to make some computer procedures more efficient: Just skip a bunch of steps. Although the researchers demonstrated several practical applications of the technique, dubbed loop perforation, they realized it would be a hard sell. "The main impediment to adoption of this technique," Imperial College London's Cristian Cadar commented at the time, "is that developers are reluctant to adopt a technique where they don't exactly understand what it does to the program."

Loop perforation is just one example of a way in which computer scientists are looking to trade a little bit of accuracy for substantial gains in performance. Others include high-speed chips that yield slightly inaccurate answers to arithmetic problems and low-power memory circuits that don't guarantee perfectly faithful storage. But all of these approaches provoke skepticism among engineers: If a computation is intrinsically unreliable, how do we know it won't fail catastrophically?

At the Association for Computing Machinery's Conference on Programming Language Design and Implementation in June, Rinard's group will present a new mathematical framework that allows computer scientists to reason rigorously about sloppy computation. The framework can provide mathematical guarantees that if a computer program behaves as intended, so will a fast-but-inaccurate modification of it.

"Loop perforation shares with a lot of the research we've done this kind of happy-go-lucky, let's-give-it-a-go-and-see-what-happens approach," says Rinard, a professor in MIT's Department of Electrical Engineering and Computer Science. "But once you observe a phenomenon, it helps to understand why you see what you see and to put a formal framework around it."

Incentive structure

The new research, which also involved lead author Michael Carbin and his fellow graduate students Deokhwan Kim and Sasa Misailovic, fits into the general category of formal verification. Verification is a method for mathematically proving that a program does what it's supposed to. It's used in hardware design, in academic work on algorithms and in the development of critical software that can't tolerate bugs. But because it's labor intensive, it's rarely used in the development of commercial applications.

That's starting to change, however, as automated verification tools become more reliable and accessible. Carbin hopes that the performance gains promised by techniques such as loop perforation will give programmers an incentive to adopt formal verification techniques. "We're identifying all these opportunities where programmers can get much bigger benefits than they could before if they're willing to do a little verification," Carbin says. "If you have these large performance gains that just don't come about otherwise, then you can incentivize people to actually go about doing these things."

As its name might imply, loop perforation involves the computer routines known as loops, in which the same operation is performed over and over on a series of data; a perforated loop is one in which iterations of the operation are omitted at regular intervals. Like other techniques that trade accuracy for performance, Rinard explains, loop perforation works well with tasks that are, in the jargon of computer science, nondeterministic: They don't have to yield a single, inevitable answer. A few pixels in a frame of Internet video might be improperly decoded without the viewer's noticing; similarly, the precise order of the first three results of a Web search may not matter as much as returning the results in a timely fashion.
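
To make the idea concrete, here is a minimal sketch of a perforated loop in Python. This is an illustration only, not code from the MIT paper: the perforated version visits only every skip-th element, trading accuracy for roughly proportional savings in work.

    def mean_intensity(pixels):
        # Exact loop: visit every pixel.
        total = 0.0
        for p in pixels:
            total += p
        return total / len(pixels)

    def mean_intensity_perforated(pixels, skip=2):
        # Perforated loop: visit only every skip-th pixel.
        # The answer is approximate, but the loop does about skip times less work.
        total = 0.0
        count = 0
        for i in range(0, len(pixels), skip):
            total += pixels[i]
            count += 1
        return total / count

For smoothly varying data such as pixel intensities, the perforated estimate typically lands close to the exact average; for pathological inputs it can be far off, which is exactly why a formal framework for reasoning about such relaxations matters.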

Drawing boundaries

With the researchers' framework, the programmer must specify "acceptability properties" for each procedure in a program. Those properties can be the types of things that formal verification typically ensures: that the result of a computation falls within a range of values, for instance, or that the output of a function adheres to a particular file format. But with the MIT framework, the programmer can also specify acceptability properties by reference to the normal execution of the program: The output of the modified procedure must be within 10 percent of the output of the unmodified procedure, say, or it must yield the same values, but not necessarily in the same order.
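
As a rough illustration of such properties (again a sketch, not the paper's formal notation), the two examples above might be written as simple checks comparing a relaxed result against the original:

    def within_relative_error(exact, relaxed, tolerance=0.10):
        # Acceptability property: the relaxed result must fall within
        # `tolerance` (here 10 percent) of the exact result.
        if exact == 0:
            return relaxed == 0
        return abs(relaxed - exact) / abs(exact) <= tolerance

    def same_values_any_order(exact_results, relaxed_results):
        # Acceptability property: the relaxed procedure must return the
        # same values, though not necessarily in the same order.
        return sorted(exact_results) == sorted(relaxed_results)

    # Example: an exact sum versus a perforated sum that skips every other element.
    data = list(range(1, 101))
    exact = sum(data)
    relaxed = 2 * sum(data[::2])                  # crude extrapolation from half the data
    print(within_relative_error(exact, relaxed))  # True: about 1 percent off

The point of the framework is that such properties are proved once, statically, rather than checked at run time as in this sketch.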

One advantage of the framework is that it allows developers who have already verified their programs to reuse much of their previous reasoning. In many cases, the programmer can define an acceptability property in such a way that, if it's met, the relaxed program will also preserve any other properties that the programmer has verified.

In the framework described in the new paper, the programmer must also describe how the procedure is to be modified, whether through loop perforation or some other technique. But Carbin says the group is already working on a computer system that allows the programmer to simply state acceptability properties; the system then automatically determines which modifications preserve those properties, with what types of performance gains.
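
The paper does not describe that follow-on system, but the flavor of the idea can be sketched empirically (a hypothetical, run-time search, rather than the static analysis the group is building): try progressively more aggressive relaxations and keep the most aggressive one that still satisfies the stated acceptability property.

    def exact_mean(data):
        return sum(data) / len(data)

    def perforated_mean(data, skip):
        sample = data[::skip]
        return sum(sample) / len(sample)

    def most_aggressive_acceptable_skip(data, tolerance=0.10, candidates=(2, 4, 8, 16)):
        # Keep the largest perforation factor whose result stays within
        # `tolerance` of the exact answer on this (representative) input.
        reference = exact_mean(data)
        best = 1
        for skip in candidates:
            approx = perforated_mean(data, skip)
            if abs(approx - reference) <= tolerance * abs(reference):
                best = skip
        return best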

"This idea of relaxation — trading the traditional notion that a computer has to do every part of a computation exactly correctly for huge energy savings, or performance savings, or ease of implementation — is not a new idea," says Dan Grossman, an associate professor of and engineering at the University of Washington, who also works on program relaxation. "But what this paper does is put in on a much firmer footing."

The paper also, Grossman says, shows "how you can formally verify software. But what it's doing by doing that is explaining what relaxed software actually is, what it means, what exactly it's relaxing."


More information: Paper: “Proving Acceptability Properties of Relaxed Nondeterministic Approximate Programs” (PDF)



User comments: 16


muggins
3 / 5 (2) May 23, 2012
Sounds like a good technique for AI, depending on the performance increase and accuracy trade-off. Allowing a robot to scan a dynamically changing environment while tolerating small inaccuracies would probably increase the speed at which it can perform a task.
gwrede
1.7 / 5 (3) May 23, 2012
There's a lower limit to reliability in any given task. With most tasks, this limit is subject to discussion. (Example, quality of video feed.)

So far, this looks innocent, but wait a decade.

With ever increasing corporate greed and competition, it looks like the days of a good TV experience are over. Soon you won't be able to buy a calculator that's even in the ballpark half the time. And matching your checking account? Forget it.

When you check your balance at the ATM or on the net, instead of a figure, you get either "I guess it's OK", "OK-ish", "Kind-of OK", "Not really OK", "Naaahh" or "There may be a problem". But what each means, isn't exactly defined either. It can't be because no exact limits are storable exactly and reliably.

Oh, boy. So I'm in the last generation to see Proper Computers. From now on, it's either quantum computing, or this Purple Haze stuff.
Phil DePayne
1 / 5 (1) May 23, 2012
Can anyone say
"I, Robot"?
Lurker2358
1.7 / 5 (3) May 23, 2012
"File saved...hopefully...maybe...some of it at least..."

"Data extracted, I think..."

"Copying...some mistakes are likely."

*enters correct password*: "Incorrect user name or password. Please try again."

*enters incorrect password*: "Close enough for me."

This is going to be worse than an Austin Powers movie, I swear...
kaasinees
1.8 / 5 (5) May 23, 2012
A good way is to hire a computer scientist to cooperate with a physicist or mathematician who actually knows about programming languages and system structures, unlike most American "computer scientists".
krundoloss
not rated yet May 23, 2012
It seems like this is more akin to what humans do. Sounds great for A.I., but not so much for real computers. As fast and powerful as computer hardware is these days, why would you bother with this technique? My computer at home has 6 CPU cores and 16 GB of quad-channel RAM; it doesn't have to "guesstimate", it can figure it out for sure. If you guys want to improve program performance, learn how to make single-threaded apps split into multiple CPU threads automatically!
xX_GT_Xx
5 / 5 (1) May 23, 2012
Not super useful for mathematical calculation, obviously. I see the benefit in logical comparisons though. Just think of the mechanics of a binary search on an array, where pinpoint accuracy is not needed until the very end.
Pressure2
not rated yet May 23, 2012
A shortcut here and a shortcut there should fit right in. Isn't that the human way of doing things?
wwqq
not rated yet May 23, 2012
Computation is cheap. It's mostly memory operations you have to wait on.

I don't see how you're going to write some general code that gets a speed up on all architectures, because they will all use different cache line sizes.

On a Sandy Bridge CPU the cache line size is 64 bytes. Thus, if you read a float, or a byte, or whatever, the CPU will still insist on reading 64 bytes (16 floats, 8 doubles...) from RAM and putting them in cache. That's the minimum amount of memory that can be transferred from RAM to the CPU.

If you're reading every other 64-byte chunk that's not going to result in any speed up on CPUs with a 128-byte cache line size.
Code_Warrior
4.2 / 5 (5) May 23, 2012
If used appropriately, the technique is reasonable. In a typical environment where data drives decision making there is typically an initial weeding out of alternatives based upon gross approximations followed by a more accurate look at the options that remain. Why use slow, high accuracy methods to weed out options if you can achieve the same result with fast, gross approximations? It is done all the time in industry.

On the other hand, there is no advantage to inaccurate calculation of a bank balance since the math is mostly simple addition that is done extremely fast. The same with copying data; moving bytes is so simple and fast that it would be more complicated to incorporate loop perforation.

It's like any other tool. If you use it foolishly where accuracy is needed, then it will fail miserably. Use it as it is intended and it will improve performance.
cmn
not rated yet May 23, 2012
It seems like this is more akin to what humans do. Sounds great for A.I., but not so much for real computers. As fast and powerful as computer hardware is these days, why would you bother with this technique? My computer at home has 6 CPU cores and 16 GB of quad-channel RAM; it doesn't have to "guesstimate", it can figure it out for sure. If you guys want to improve program performance, learn how to make single-threaded apps split into multiple CPU threads automatically!


Mine too, and it still takes months to crunch some of the numbers I work on.
cmn
not rated yet May 23, 2012
Computation is cheap. It's mostly memory operations you have to wait on.

I don't see how you're going to write some general code that gets a speed up on all architectures, because they will all use different cache line sizes.

On a Sandy Bridge CPU the cache line size is 64 bytes. Thus, if you read a float, or a byte, or whatever, the CPU will still insist on reading 64 bytes (16 floats, 8 doubles...) from RAM and putting them in cache. That's the minimum amount of memory that can be transferred from RAM to the CPU.

If you're reading every other 64-byte chunk that's not going to result in any speed up on CPUs with a 128-byte cache line size.


What if you read every other 128-byte chunk? ;) There's always speed-ups to be had by skipping steps, unless you ignore the obvious logistical nuances.
A_Paradox
not rated yet May 24, 2012
Code Warrior, I agree,
The process could speed up the creation of draft copies of surface renderings, or of out-of-focus or distant features of scenery. This could perhaps speed up certain types of computer games.

Possibly the manipulation of data about volumes of material could also be sped up significantly, e.g. seismic surveys, volumetric radar and sonar scanning. An ability to adjust, on the fly, the algorithmic depth of nested and recursive loops and/or the spatial frequency of sampling points might allow much faster searching in a whole lot of situations.

In fact this might be a way to do what Krundoloss asks; one or more CPU cores specialising in up-front fuzzy searching with other cores concentrating with much greater accuracy upon candidate 'objects of interest'.
PhotonX
not rated yet May 24, 2012
My computers crunch SETI data in their spare time. No harm there as far as I can see, since any positive hits will be triple-checked anyway, so why not speed up the search and check ten times the data?

Good point about checking accounts. Nowhere to go wrong with simple arithmetic.
wwqq
not rated yet May 24, 2012
What if you read every other 128-byte chunk? ;) There's always speed-ups to be had by skipping steps, unless you ignore the obvious logistical nuances.


May not work well on 64-byte cache line CPUs.

CPUs speculatively prefetch memory. They try to guess what you'll be needing next and fetch it a few hundred clocks in advance so that it's available when you do need it.

If you're reading every other cache line, it's almost certainly going to prefetch correctly. If you get two, skip two, get two, skip two... it may just read all of it(and pollute the cache with a bunch of irrelevant data).
jackofshadows
not rated yet May 28, 2012
I understand the motivation and the actual article. While I was serving in the military, I used mathematical and logical rigor (formal verification) in all the engineering projects, including software and systems. Screwing up badly enough that people were harmed (or killed) could result in being sent to a place where the guards are a bunch of pissed-off US Marines who really don't want to be there while you are doing hard time at hard labor. That definitely colors one's perception and procedures.

That said, we do have formal proofs on tap, more every year; it is time to examine exactly what boundary conditions mean for the expected accuracy (usability). Frankly, we should have been doing this for a while now. Those who actually understand the limits of accuracy and significance in mathematical calculations are more than halfway to understanding the limits and usability of rigor in engineering. Sometimes 'rule of thumb' IS the best answer.
