New frontier in error-correcting codes

October 1, 2014 by Larry Hardesty
Credit: Jose-Luis Olivares/MIT

Error-correcting codes are one of the glories of the information age: They're what guarantee the flawless transmission of digital information over the airwaves or through copper wire, even in the presence of the corrupting influences that engineers call "noise."

But classical error-correcting codes work best with large chunks of data: The bigger the chunk, the higher the rate at which it can be transmitted error-free. In the Internet age, however, distributed computing is becoming more and more common, with devices repeatedly exchanging small chunks of data over long periods of time.

So for the last 20 years, researchers have been investigating interactive-coding schemes, which address the problem of long sequences of short exchanges. Like classical error-correcting codes, interactive codes are evaluated according to three criteria: How much noise can they tolerate? What's the maximum transmission rate they afford? And how time-consuming are the encoding and decoding processes?

At the IEEE Symposium on Foundations of Computer Science this month, MIT graduate students past and present will describe the first interactive coding scheme to approach the optimum on all three measures.

"Previous to this work, it was known how to get two out of three of these things to be optimal," says Mohsen Ghaffari, a graduate student in electrical engineering and computer science and one of the paper's two co-authors. "This paper achieves all three of them."

Vicious noise

Moreover, where Claude Shannon's groundbreaking 1948 analysis of error-correcting codes considered the case of random noise, in which every bit of transmitted data has the same chance of being corrupted, Ghaffari and his collaborator—Bernhard Haeupler, who did his graduate work at MIT and is now an assistant professor at Carnegie Mellon University—consider the more stringent case of "adversarial noise," in which an antagonist is trying to interfere with transmission in the most disruptive way possible.

"We don't know what type of random noise will be the one that actually captures reality," Ghaffari explains. "If we knew the best one, we would just use that. But generally, we don't know. So you try to generate a coding that is as general as possible." A coding scheme that could thwart an active adversary would also thwart any type of .

Error-correcting codes—both classical and interactive—work by adding some extra information to the message to be transmitted. They might, for instance, tack on some bits that describe arithmetic relationships between the message bits. Both the message bits and the extra bits are liable to corruption, so decoding a message—extracting the true sequence of message bits from the sequence that arrives at the receiver—is usually a process of iterating back and forth between the message bits and the extra bits, trying to iron out discrepancies.
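To make that concrete, here is a minimal sketch in Python of one classical example, a Hamming(7,4) code. It is purely illustrative: the three parity bits play the role of the "extra bits that describe arithmetic relationships between the message bits," and checking those relationships at the receiver locates and repairs a single flipped bit. It is not the interactive scheme described in the paper.

# A minimal sketch of a classical error-correcting code: Hamming(7,4).
# The parity bits encode arithmetic (XOR) relationships among the four
# message bits; re-checking those relationships at the receiver locates
# and fixes any single corrupted bit.

def hamming74_encode(d1, d2, d3, d4):
    # Each parity bit covers a different subset of the message bits.
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Standard positional layout: positions 1..7 are p1, p2, d1, p3, d2, d3, d4.
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(received):
    # Re-check each parity relationship on the (possibly corrupted) bits.
    p1, p2, d1, p3, d2, d3, d4 = received
    s1 = p1 ^ d1 ^ d2 ^ d4
    s2 = p2 ^ d1 ^ d3 ^ d4
    s3 = p3 ^ d2 ^ d3 ^ d4
    # The three check results, read as a binary number, point at the bad position.
    syndrome = s1 + 2 * s2 + 4 * s3
    corrected = list(received)
    if syndrome:
        corrected[syndrome - 1] ^= 1  # flip the offending bit back
    # Return just the four message bits.
    return [corrected[2], corrected[4], corrected[5], corrected[6]]

message = [1, 0, 1, 1]
codeword = hamming74_encode(*message)
codeword[5] ^= 1                      # simulate noise flipping one bit in transit
assert hamming74_decode(codeword) == message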

In interactive communication, the maximum tolerable error rate is one-fourth: If the adversary can corrupt more than a quarter of the bits sent, perfectly reliable communication is impossible. Some prior interactive-coding schemes, Ghaffari explains, could handle that error rate without requiring too many extra bits. But the decoding process was prohibitively complex.

Making a list

To keep the complexity down, Ghaffari and Haeupler adopted a technique called list decoding. Rather than iterating back and forth between message bits and extra bits until the single most probable interpretation emerges, their algorithm iterates just long enough to create a list of likely candidates. At the end of their mutual computation, each of the interacting devices may have a list with hundreds of entries.

But each device, while it has only imperfect knowledge of the messages sent by the other, has perfect knowledge of the messages it sent. So if, at the computation's end, the devices simply exchange lists, each has enough additional information to zero in on the optimal decoding.
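As a rough illustration of why exchanging lists is enough, here is a toy Python sketch. The message names and dictionary layout are hypothetical, and this is not the authors' algorithm; it only shows the consistency check that lets each device discard candidates contradicting what it knows it sent.

# A toy sketch of the list-exchange idea: after list decoding, each device
# holds several candidate transcripts of the whole conversation. Each device
# knows its own outgoing messages perfectly, so it can discard any candidate
# whose record of those messages is wrong. (Hypothetical names throughout.)

def consistent_candidates(candidates, side, actually_sent):
    # Keep only the candidate transcripts that agree with what this device really sent.
    return [t for t in candidates if t[side] == actually_sent]

# Device A's perfect knowledge of its own transmissions.
a_sent = ["req:1", "req:2"]

# A hypothetical candidate list produced by device B's list decoder and sent to A.
b_candidates = [
    {"a_to_b": ["req:1", "req:2"], "b_to_a": ["ack:1", "ack:2"]},  # matches what A sent
    {"a_to_b": ["req:1", "req:9"], "b_to_a": ["ack:1", "nak:9"]},  # contradicts A's record
]

# A prunes B's list against its own transcript, zeroing in on the surviving interpretation.
print(consistent_candidates(b_candidates, "a_to_b", a_sent))
# [{'a_to_b': ['req:1', 'req:2'], 'b_to_a': ['ack:1', 'ack:2']}]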

The maximum tolerable error rate for an interactive-coding scheme—one-fourth—is a theoretical result. The minimum length of an encoded message and the minimum decoding complexity, on the other hand, are surmises based on observation.

But Ghaffari and Haeupler's decoding algorithm is nearly linear, meaning that its execution time is roughly proportional to the length of the messages exchanged.

But linear relationships are still defined by constants: y = x is a linear relationship, but so is y = 1,000,000,000x. A linear algorithm that takes an extra second of computation for each additional bit of data it considers isn't as good as a linear algorithm that takes an extra microsecond.
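The point about constants can be made with a trivial Python comparison, unrelated to the coding scheme itself: both functions below scale linearly with the input, but the hidden constant makes one of them far more expensive in practice.

# Both routines do O(n) work, but the per-element constant differs by a
# factor of 1,000; "linear" alone says nothing about which is usable.

def cheap_linear(bits):
    # roughly one unit of work per bit
    return sum(bits)

def expensive_linear(bits):
    # still linear in len(bits), but ~1,000 units of work per bit
    total = 0
    for b in bits:
        for _ in range(1000):
            total += b
    return total // 1000

data = [1, 0, 1] * 10000
assert cheap_linear(data) == expensive_linear(data)  # same answer, very different cost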


More information: Mohsen Ghaffari and Bernhard Haeupler, "Optimal Error Rates for Interactive Coding II: Efficiency and List Decoding," people.csail.mit.edu/ghaffari/papers/interactive_protocol2.pdf



Comments

Pattern_chaser
Oct 02, 2014
This weird article seems just about to say something of substance, then it ends. I'm baffled. Has something new been discovered or achieved, or is it just an article that says people would like to improve EC, if they can? Have I missed something?
Mike_Massen
Oct 02, 2014
verkle commented: "...Now we must take yet another approach to optimize for noise conditions. I'm sure there is still much more to invent in the future."
It should be known genetic algorithms have been used in design to generate product designs, all that is needed is a suitable environment where there are various differentials in conjunction with fixed rules of interaction.

Great complexity has been shown to arise from simplicity. It only requires time and space and both can be vast... !

Trial & Error does seem to be the core aspect of nature, founded on chemistry.

Such approaches re Trial & Error are also evident in steps to craft solutions to products, the resulting design arises or rather evolves from this process being exercised many times...

It's clear to information tech students that the EC codes in DNA could be vastly improved; it's rather obvious they are not optimal by any means!

@Pattern_chaser
Many articles could have more substance; this one seems a useful introduction at best.
antialias_physorg
Oct 02, 2014
The issue of noise is not as straightforward as one might think. Stochastic noise (equal-probability bit flips) requires an entirely different approach than block noise (e.g. a lightning strike on a line, which wipes out an entire section of the transmission). For that type of noise, SMALL chunks of (redundant) data dispersed over time are actually better than large chunks with error correction, as the article mentions.
Asymmetric noise (e.g. where flipping from a 0 to a 1 is more likely than the other way around) favors yet another approach.
In distributed systems you can have any and all of these to a varying (and possibly time variable!) degree.
Once you go into scenarios which include dedicated attack (i.e. your encoding/error-correction scheme is known to the attacker and the attack is optimized to produce the most damage with the least amount of change in the message), things get really freaky.
Pattern_chaser
Oct 02, 2014
If the two communicating devices exchange lists, won't the lists become corrupted? The problem is not solved but moved...?
antialias_physorg
Oct 02, 2014
The way I understand the paper (see link at the bottom of the article), they need only make sure that one of the lists exchanged is correct - and it doesn't matter which one.

A coding scheme that could thwart an active adversary would also thwart any type of random noise.

...if I read their paper correctly with the proviso that p remains less than 0.5 (i.e. the Shannon limit isn't violated)...but I'm not through it in its entirety, yet.
