Research Leads to Self-Improving Chips with Speed 'Warping'

Oct 18, 2007

Imagine owning an automobile that can change its engine to suit your driving needs – when you’re tooling about town, it works like a super-fast sports car; when you’re hauling a heavy load, it operates like a strong, durable truck engine. While this turn-on-a-dime flexibility is impossible for cars to achieve, it is now possible for today’s computer chips.

A new, patent-pending technology called "Warp processing," developed over the last five years by Frank Vahid, UCR Professor of Computer Science and Engineering, gives a computer chip the ability to improve its performance over time.

The benefits of Warp processing are just being discovered by the computing industry. A range of companies including IBM, Intel and Motorola’s Freescale have already pursued licenses for the technology through UCR’s funding source, the Semiconductor Research Corporation.

Here’s how Warp processing works: When a program first runs on a microprocessor chip (such as a Pentium), the chip monitors the program to detect its most frequently-executed parts. The microprocessor then automatically tries to move those parts to a special kind of chip called a field-programmable gate array, or FPGA. “An FPGA can execute some (but not all) programs much faster than a microprocessor – 10 times, 100 times, even 1,000 times faster,” explains Vahid.

“If the microprocessor finds that the FPGA is faster for the program part, it automatically moves that part to the FPGA, causing the program execution to ‘warp.’” By performing optimizations at runtime, Warp processors also eliminate tool flow restrictions, as well as the extra designer effort associated with traditional compile-time optimizations.
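The runtime decision described above can be sketched in software: count how often a routine runs, and once it crosses a hotspot threshold, time a candidate accelerated version against the software version and route all future calls to the faster one. This is only an illustrative sketch of the decision logic, not Vahid's actual tool; the names (`WarpDispatcher`, `HOT_THRESHOLD`) are hypothetical, and an ordinary Python function stands in for the FPGA circuit.

```python
import time

HOT_THRESHOLD = 1000  # calls before a routine counts as a hotspot (illustrative)

class WarpDispatcher:
    """Routes calls to a software function until it becomes 'hot', then
    benchmarks a stand-in accelerated version and keeps the faster one."""

    def __init__(self, software_fn, accelerated_fn):
        self.software_fn = software_fn
        self.accelerated_fn = accelerated_fn  # stand-in for the FPGA circuit
        self.calls = 0
        self.chosen = software_fn

    def __call__(self, *args):
        self.calls += 1
        if self.calls == HOT_THRESHOLD:
            # Hotspot detected: time both versions once, then "warp"
            # execution to whichever was faster.
            t0 = time.perf_counter(); self.software_fn(*args)
            t1 = time.perf_counter(); self.accelerated_fn(*args)
            t2 = time.perf_counter()
            if (t2 - t1) < (t1 - t0):
                self.chosen = self.accelerated_fn
        return self.chosen(*args)
```

In the real system the monitoring happens in hardware and the accelerated version is a circuit synthesized on the fly at runtime; the sketch captures only the profile-then-switch decision.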

FPGAs can benefit a wide range of applications, including video and audio processing; encryption and decryption; encoding; compression and decompression; bioinformatics – anything that is compute-intensive and operates on large streams of data. Consumers who want to enhance their photos using Photoshop or edit videos on their desktop computers will find that Warp processing speeds up their systems, while gamers will immediately notice the difference in better graphics and performance. Additionally, embedded systems such as medical instruments or airport security scanners can perform real-time recognition using Warp-enhanced FPGAs.

“Thread Warping: A Framework for Dynamic Synthesis of Thread Accelerators” was named one of the top five papers at the 2007 International Conference on Hardware/Software Codesign and System Synthesis (CODES/ISSS) in Austria and was published in the conference proceedings. “Warp Processing and Just-in-Time FPGA Compilation,” the Ph.D. dissertation of Vahid’s student Roman Lysecky, was named “Dissertation of the Year” by the European Design and Automation Association in 2006.

“When large supercomputers were shrunk down into smaller devices, we didn’t know what they'd be used for,” says Vahid. “And over the years, we’ve seen the emergence of cell phones, MP3 players, smart cars, and intelligent pacemakers – technology that was hard to imagine 20 years ago. Warp processing makes the speedup potential of FPGAs accessible to every computer, whether in a PC, cell phone, or elsewhere. That makes the potential for future development of brand-new applications – applications that we can’t conceive of now – very exciting.”

Source: University of California, Riverside




User comments : 3


Oct 18, 2007

What happens if the system mistakenly thinks it can execute that code faster, and causes an error, or takes a lot longer instead?
Oct 19, 2007
"...when you're tooling about town, it works like a super-fast sports car". LOL! Because we all know that's the best time to drive like a nut. HELLO? Anybody else have kids and live in town?!

Anyway, this FPGA tech sounds very promising and very common sense. Logical next step after data prefetching.
Oct 20, 2007
QC: First of all, this computing technology seems to use a learning curve similar to the statistical analysis done by genetic algorithms (GAs). I'm not proposing GAs actually be used for this, but the idea would be similar: apply a fitness value to functions. Unlike in a GA, where fitness raises a candidate's chance of being chosen for reproduction, here a higher fitness would trigger the transfer of those functions to the FPGA, which would cut the bus, CPU, and memory access times out of each function call; you'd simply send the variables for the function to the FPGA.

Let's use Euclid's algorithm for determining the GCD of two positive integers as an example. The code for a non-recursive version looks like this, where $r is the remainder, $m is the first positive integer, and $n the second:

$r = $m % $n;
while ($r) {
    $m = $n;
    $n = $r;
    $r = $m % $n;
}

Now look at all the operations taking place. For this to execute, you first have to send the two variables to a modulus operation, then copy the result to $r, then repeat until $r is 0. In assembly (the closest thing you'll see to what the computer is actually doing step by step), this translates to what you'd see here: http://www.cs.usf...07/gcd.s . Sure is a lot of stuff to do just for that little bit of code up there, isn't it? Now, using the FPGA warping they're discussing here, this assembly could do nothing more than two movs to the FPGA address space where the two variables are expected, then receive the final return value, leaving all those other cycles free for something else.

This is a grand idea, and I don't think a minor flaw in estimating the speedup that ends in reduced speed would be problematic, as long as the system "learns" that this is not efficient. Since it would never be efficient to copy a function used only once to the FPGA, any candidate must be a repeated function, which makes a two-run test possible: first run normally as you would on a standard setup, then copy to the FPGA and run. If the FPGA shows to be faster, use that; if not, revert to the standard execution and remember that this is the most economical method.
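The commenter's two-run test can be sketched in software using the Euclid GCD from the comment: run the software version once, run a stand-in for the FPGA version once, and remember whichever was faster. Everything here is illustrative; `math.gcd` merely stands in for a hardware implementation, and `best_method` and `adaptive_gcd` are hypothetical names, not part of the actual system.

```python
import math
import time

def software_gcd(m, n):
    """Direct translation of the non-recursive Euclid loop from the comment."""
    r = m % n
    while r:
        m = n
        n = r
        r = m % n
    return n

best_method = {}  # remembers which implementation won the two-run test

def adaptive_gcd(m, n):
    """First call: run both versions, keep the faster. Later calls reuse it."""
    if "gcd" not in best_method:
        t0 = time.perf_counter(); a = software_gcd(m, n)
        t1 = time.perf_counter(); b = math.gcd(m, n)  # stand-in for the FPGA
        t2 = time.perf_counter()
        assert a == b  # sanity check: both versions must agree
        best_method["gcd"] = math.gcd if (t2 - t1) < (t1 - t0) else software_gcd
        return a
    return best_method["gcd"](m, n)
```

The dictionary plays the role of the "remember the economical method" step: a slow offload is measured once and never repeated.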

