The faster-than-fast Fourier transform

Jan 18, 2012 by Larry Hardesty
Graphic: Christine Daniloff

The Fourier transform is one of the most fundamental concepts in the information sciences. It’s a method for representing an irregular signal — such as the voltage fluctuations in the wire that connects an MP3 player to a loudspeaker — as a combination of pure frequencies. It’s universal in signal processing, but it can also be used to compress image and audio files, solve differential equations and price stock options, among other things.

The reason the Fourier transform is so prevalent is an algorithm called the fast Fourier transform (FFT), devised in the mid-1960s, which made it practical to calculate Fourier transforms on the fly. Ever since the FFT was proposed, however, people have wondered whether an even faster algorithm could be found.

At the Association for Computing Machinery’s Symposium on Discrete Algorithms (SODA) this week, a group of MIT researchers will present a new algorithm that, in a large range of practically important cases, improves on the fast Fourier transform. Under some circumstances, the improvement can be dramatic — a tenfold increase in speed. The new algorithm could be particularly useful for image compression, enabling, say, smartphones to wirelessly transmit large video files without draining their batteries or consuming their monthly bandwidth allotments.

Like the FFT, the new algorithm works on digital signals. A digital signal is just a series of numbers — discrete samples of an analog signal, such as the sound of a musical instrument. The FFT takes a digital signal containing a certain number of samples and expresses it as the weighted sum of an equivalent number of frequencies.
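
To make that concrete, here is a minimal sketch (illustrative Python, not the researchers' code) using NumPy's off-the-shelf FFT: eight samples go in, eight frequency weights come out, and the inverse transform recovers the original samples exactly.

```python
import numpy as np

# Eight samples of a (roughly) sinusoidal digital signal.
samples = np.array([0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0, -0.7])

weights = np.fft.fft(samples)        # one complex weight per frequency
recovered = np.fft.ifft(weights)     # the inverse transform undoes it

print(np.round(np.abs(weights), 2))          # most of the energy sits in two bins
print(np.allclose(recovered.real, samples))  # True: nothing is lost
```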

“Weighted” means that some of those frequencies count more toward the total than others. Indeed, many of the frequencies may have such low weights that they can be safely disregarded. That’s why the Fourier transform is useful for compression. An eight-by-eight block of pixels can be thought of as a 64-sample signal, and thus as the sum of 64 different frequencies. But as the researchers point out in their new paper, empirical studies show that on average, 57 of those frequencies can be discarded with minimal loss of image quality.
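
The sketch below illustrates that compression idea on a synthetic 8-by-8 block of smooth shading plus a little noise; the block, the choice of transform and every number in it are illustrative assumptions, and JPEG-style codecs actually use a discrete cosine transform rather than the complex FFT used here. Keeping only the 7 heaviest of the 64 frequency weights still reconstructs the block with a small relative error.

```python
import numpy as np

# A synthetic 8x8 "pixel block": smooth cosine shading plus a little noise.
rng = np.random.default_rng(0)
i = np.arange(8)
block = (128
         + 60 * np.cos(2 * np.pi * i / 8)[:, None]
         + 40 * np.cos(2 * np.pi * 2 * i / 8)[None, :]
         + rng.normal(0, 2, (8, 8)))

coeffs = np.fft.fft2(block)                        # 64 frequency weights
keep = 7                                           # discard the other 57
cutoff = np.sort(np.abs(coeffs).ravel())[-keep]
sparse_coeffs = np.where(np.abs(coeffs) >= cutoff, coeffs, 0)

approx = np.fft.ifft2(sparse_coeffs).real          # rebuild from the 7 kept weights
error = np.linalg.norm(approx - block) / np.linalg.norm(block)
print(f"relative error with 7 of 64 weights kept: {error:.3f}")
```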

Heavyweight division

Signals whose Fourier transforms include a relatively small number of heavily weighted frequencies are called “sparse.” The new algorithm determines the weights of a signal’s most heavily weighted frequencies; the sparser the signal, the greater the speedup the algorithm provides. Indeed, if the signal is sparse enough, the algorithm can simply sample it randomly rather than reading it in its entirety.
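
As a sketch of what the algorithm is aiming for, the code below builds a signal with just three heavily weighted frequencies (all values invented for illustration) and uses an ordinary full FFT as a stand-in to pick out those weights; the point of the new algorithm is that, for sparse signals, the same few weights can be found without computing the full transform.

```python
import numpy as np

n = 1024
t = np.arange(n)

# A sparse signal: only k = 3 heavily weighted frequencies.
signal = (2.0 * np.cos(2 * np.pi * 50 * t / n)
          + 1.5 * np.cos(2 * np.pi * 130 * t / n)
          + 0.8 * np.cos(2 * np.pi * 300 * t / n))

spectrum = np.fft.fft(signal)
k = 3
top = np.argsort(np.abs(spectrum))[-2 * k:]   # each real tone occupies 2 bins
sparse = np.zeros_like(spectrum)
sparse[top] = spectrum[top]

error = np.linalg.norm(np.fft.ifft(sparse).real - signal) / np.linalg.norm(signal)
print(f"relative error keeping only the heaviest weights: {error:.2e}")
```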

“In nature, most of the normal signals are sparse,” says Dina Katabi, one of the developers of the new algorithm. Consider, for instance, a recording of a piece of chamber music: The composite signal consists of only a few instruments each playing only one note at a time. A recording, on the other hand, of all possible instruments each playing all possible notes at once wouldn’t be sparse — but neither would it be a signal that anyone cares about.

The new algorithm — which associate professor Katabi and professor Piotr Indyk, both of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), developed together with their students Eric Price and Haitham Hassanieh — relies on two key ideas. The first is to divide a signal into narrower slices of bandwidth, sized so that a slice will generally contain only one frequency with a heavy weight. 

In signal processing, the basic tool for isolating particular frequencies is a filter. But filters tend to have blurry boundaries: One range of frequencies will pass through the filter more or less intact; frequencies just outside that range will be somewhat attenuated; frequencies outside that range will be attenuated still more; and so on, until you reach the frequencies that are filtered out almost perfectly.

If it so happens that the one frequency with a heavy weight is at the edge of the filter, however, it could end up so attenuated that it can’t be identified. So the researchers’ first contribution was to find a computationally efficient way to combine filters so that they overlap, ensuring that no frequencies inside the target range will be unduly attenuated, but that the boundaries between slices of spectrum are still fairly sharp.
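
The sketch below shows the flavor of such a filter, with purely illustrative parameters rather than the authors' exact construction: smoothing the edges of an ideal boxcar passband with a narrow Gaussian gives a response that stays essentially flat across the interior of the slice yet drops to nearly zero a short distance outside it.

```python
import numpy as np

n = 4096
freqs = np.arange(n)

# An ideal (boxcar) slice of spectrum, 64 bins wide, starting at bin 0.
boxcar = (freqs < 64).astype(float)

# Smooth its edges with a narrow Gaussian.
sigma = 4.0
kernel = np.exp(-0.5 * (np.arange(-32, 33) / sigma) ** 2)
kernel /= kernel.sum()
flat_window = np.convolve(boxcar, kernel, mode="same")

# Flat across the interior of the slice, nearly zero a short distance outside it.
print(flat_window[16:48].min())   # deep inside the passband: ~1.0
print(flat_window[96])            # 32 bins past the edge: ~0.0
```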

Zeroing in

Once they’ve isolated a slice of spectrum, however, the researchers still have to identify the most heavily weighted frequency in that slice. In the SODA paper, they do this by repeatedly cutting the slice of spectrum into smaller pieces and keeping only those in which most of the signal power is concentrated. But in an as-yet-unpublished paper, they describe a much more efficient technique, which borrows a signal-processing strategy from 4G cellular networks. Frequencies are generally represented as up-and-down squiggles, but they can also be thought of as oscillations; by sampling the same slice of bandwidth at different times, the researchers can determine where the dominant frequency is in its oscillatory cycle.
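
The core of that trick can be shown in a few lines, under the simplifying assumption that filtering has already left essentially a single tone in the slice: the phase advance between two samples taken one step apart pins down the tone's frequency directly, with no transform at all. The published algorithm is considerably more involved; this toy, with invented numbers, only shows the underlying observation.

```python
import numpy as np

n = 1024
f_true = 217                     # the dominant frequency left in the slice
t = np.arange(n)
tone = 3.0 * np.exp(2j * np.pi * f_true * t / n)

phase_step = np.angle(tone[1] / tone[0])          # phase advance per sample
f_estimate = int(round(phase_step * n / (2 * np.pi))) % n

print(f_estimate == f_true)                       # True
```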

Two University of Michigan researchers — Anna Gilbert, a professor of mathematics, and Martin Strauss, an associate professor of mathematics and of electrical engineering and computer science — had previously proposed an algorithm that improved on the FFT for very sparse signals. “Some of the previous work, including my own with Anna Gilbert and so on, would improve upon the fast algorithm, but only if the sparsity k” — the number of heavily weighted frequencies — “was considerably smaller than the input size n,” Strauss says. The MIT researchers’ algorithm, however, “greatly expands the number of circumstances where one can beat the traditional FFT,” Strauss says. “Even if that number k is starting to get close to n — to all of them being important — this still gives some improvement over FFT.”

User comments: 20

Tausch
not rated yet Jan 18, 2012
Good news for synthesizing the sound of any instrument to the point of being indistinguishable from the original instruments heard by humans. Arbitrary precision compromised lesser still by digital temporal restrictions on computation.

From the dawn of humankind we learned of 'shape' or 'form' from which a sound can originate. From nature - wind, water, fire - and animals to any objects or instruments of humankind.

Fortunately (or not), the synthesis of sounds is no longer confined to the original shapes or forms of matter. Energy has infinite forms, at least one of which will have the same informational acoustical content as a different form or shape of matter producing the same acoustical content.

Life forms, utilizing any acoustics to convey themselves in any way, are susceptible to what we are able to synthesize acoustically. Ourselves included.

Only one last, greatest constraint remains to be overcome:
The meaning contributed to the sounds heard by life.
Kedas
not rated yet Jan 18, 2012
Can this be used to improve the speed of the h.264 DCT method (video encoding)?
nothingness
not rated yet Jan 18, 2012
Fuzzy Faster-than-Fast Fourier Transform!
Isaacsname
not rated yet Jan 18, 2012
Funny, I went to look up some things on Wiki and figured out an easy way around the block.

:O
Isaacsname
2.3 / 5 (3) Jan 18, 2012
..haha...I was so excited I forgot to tell you.

Type the relevant word into a search engine,

Click on the Wikipedia link,

IMMEDIATELY after, within 1-2 seconds, hit the icon on your navigator bar that stops loading the page (Esc works too, I think). Normally, the Wiki page you want will show briefly before giving you the blackout page; stop loading the page and you are good to go.

Oh yes, I would like a beer, thanks.
Deathclock
5 / 5 (5) Jan 18, 2012
..haha...I was so excited I forgot to tell you.

Type the relevant word into a search engine,

Click on the Wikipedia link,

IMMEDIATELY after, within 1-2 seconds, hit the icon on your navigator bar that stops loading the page (Esc works too, I think). Normally, the Wiki page you want will show briefly before giving you the blackout page; stop loading the page and you are good to go.

Oh yes, I would like a beer, thanks.


Great, but you're missing the point of the protest... if too few people care about this your trick won't work when Wikipedia shuts down for good.
Isaacsname
1.3 / 5 (15) Jan 18, 2012
Pfft, whatever, the internet is a hydra, even if Wiki went down, there'd be multiple parties working on things to replace it. $$$$$, young man, rules the world.

Honestly, for something that is such a big deal, and such a huge website, they took a real half-assed approach if a moron like yours truly can figure it out in under 5 seconds.

What the hell kind of statement for the world is that ?

This is all bullshit, like 99% of everything in the media.

The same handful of companies that made and distributed pretty much ALL file-sharing software are the same ones that are behind SOPA.

That's like selling you dope, then busting you for it.

axemaster
1 / 5 (4) Jan 18, 2012
So in other words, this new technique is not better, it's just higher compression through some "fudging" methods.
Deathclock
4.7 / 5 (15) Jan 18, 2012
Pfft, whatever, the internet is a hydra, even if Wiki went down, there'd be multiple parties working on things to replace it. $$$$$, young man, rules the world.


You realize Wikipedia is not for profit right?

So your "$$$ rules the world" bit does not apply, at all.
Deathclock
5 / 5 (3) Jan 18, 2012
So in other words, this new technique is not better, it's just higher compression through some "fudging" methods.


They also said it's up to 10x faster... so higher compression and faster... what else do you want in order to label it "better"?

The concept of FTs will never change; the quality of the implementation is measured by its accuracy and its speed...
Deathclock
5 / 5 (7) Jan 18, 2012
The same handful of companies that made and distributed pretty much ALL file-sharing software are the same ones that are behind SOPA.


Glad I'm not allergic to nuts or I would be in trouble right now.

No one uses "file sharing software" anymore, that is not the issue. You don't understand the issue, you don't understand what SOPA does because I am sure you haven't read it.

Basically SOPA, and to a lesser extent PIPA, make website owners legally liable for the content posted by their users. This will shut down websites that allow massive user contributions, such as youtube, facebook, wikipedia, reddit, twitter, etc... because the owners/operators of those websites do not want to face jail time if one of their users uploads material that is deemed inappropriate by these bills.
Isaacsname
1.2 / 5 (5) Jan 18, 2012
The same handful of companies that made and distributed pretty much ALL file-sharing software are the same ones that are behind SOPA.


Glad I'm not allergic to nuts or I would be in trouble right now.

No one uses "file sharing software" anymore, that is not the issue. You don't understand the issue, you don't understand what SOPA does because I am sure you haven't read it.

Basically SOPA, and to a lesser extent PIPA, make website owners legally liable for the content posted by their users. This will shut down websites that allow massive user contributions, such as youtube, facebook, wikipedia, reddit, twitter, etc... because the owners/operators of those websites do not want to face jail time if one of their users uploads material that is deemed inappropriate by these bills.


That's nice Hourglass, but the bill was killed, nobody will shut anything down.
GSwift7
3 / 5 (2) Jan 18, 2012
Can this be used to improve the speed of the h.264 DCT method (video encoding)?


Yes and no. The .264 encoding is exactly the type of encoding they were trying to improve upon, so yes, this improves it. But, this is more like an entirely different method, rather than an improvement to the old one. This might be more like an MP5 or MPEG-5 format. Your television would probably still need to convert it into the .264 standard before you could watch it, just like they do when you watch an MPEG file on your TV.

Interestingly enough, I've heard that they don't actually decode the MPEG and re-encode it. Rather, they are able to directly go from one encoding to another without much signal loss. Using the above method shouldn't change that much because the conversion basically treats the MPEG as an analog wave form anyway. If I understand it correctly.
GSwift7
4.4 / 5 (7) Jan 18, 2012
So in other words, this new technique is not better, it's just higher compression through some "fudging" methods.

They also said it's up to 10x faster... so higher compression and faster... what else do you want in order to label it "better"?

The concept of FTs will never change; the quality of the implementation is measured by its accuracy and its speed


Yeah, that's right.

And it's a force multiplier for every other improvement in the entire network. If you upgrade your cell tower to handle 2x the load and then use this method to compress the data by a factor of 10, you end up with 20x your original capacity.

A big improvement over FFT is a huge improvement with far-reaching effects. Many people don't understand, but this is a landmark event. If the claims above can be realized in practical application, then these people are sure to get some recognition (Nobel prize maybe?). It's worth a lot of money if what they claim is true.
GSwift7
4.4 / 5 (9) Jan 18, 2012
One more thing:

I want to say "good job" to Physorg on this article. There aren't very many articles on this site with a link to the source paper. I am also happy that the research paper is available without a paywall, but that's the kind of quality you can count on from MIT. MIT is da BOMB.
tadchem
5 / 5 (2) Jan 18, 2012
Call it a 'fudge' if you will; the point is that the job can be done more quickly and with less demand on resources than the current technique requires. Having worked with FFT, Kalman filters, Hadamard transforms, etc., I assure you that every scrap of an advantage is worth fighting for.
NicoleTedesco
1 / 5 (1) Jan 18, 2012
Since this FFT avoids work, I call it the "LFFT", or Lazy Fast Fourier Transform!
Tausch
not rated yet Jan 18, 2012
G7/Kedas
Can this be used to improve the speed...?

Yes.
...without much signal loss

Dunno. Dunno what's an acceptable or tolerable loss compared to the original FDCTs in use.
GSwift7
1 / 5 (2) Jan 19, 2012
Dunno. Dunno what's an acceptable or tolerable loss compared to the original FDCTs in use


That's true. It's entirely dependent on the application. If you try to compress an Excel spreadsheet with something like this you would end up with a mess. For human voice conversation it would likely be perfect. For music and video you should be able to tune the coding so that it would give results similar to standard FFT for those applications. Youtube already uses something like this and the quality is acceptable, for example.

That's a good point though. This isn't really something that you could install in the hubs on the internet. It would only be applied at the origin and destination points on a case by case basis, and everyone you wanted to connect with would need to have a device capable of encoding/decoding on the same standard, just like how you need an MP3 player to play an MP3.
Isaacsname
not rated yet Jan 19, 2012