Research Leads to Self-Improving Chips with Speed 'Warping'

Oct 18, 2007

Imagine owning an automobile that can change its engine to suit your driving needs – when you’re tooling about town, it works like a super-fast sports car; when you’re hauling a heavy load, it operates like a strong, durable truck engine. While this turn-on-a-dime flexibility is impossible for cars to achieve, it is now possible for today’s computer chips.

A new, patent-pending technology called "Warp processing," developed over the past five years by Frank Vahid, professor of computer science and engineering at UCR, gives a computer chip the ability to improve its performance over time.

The computing industry is just beginning to discover the benefits of Warp processing. Companies including IBM, Intel and Freescale (Motorola's semiconductor spin-off) have already pursued licenses for the technology through UCR's funding source, the Semiconductor Research Corporation.

Here's how Warp processing works: When a program first runs on a microprocessor chip (such as a Pentium), the chip monitors the program to detect its most frequently executed parts. The microprocessor then automatically tries to move those parts to a special kind of chip called a field-programmable gate array, or FPGA. "An FPGA can execute some (but not all) programs much faster than a microprocessor – 10 times, 100 times, even 1,000 times faster," explains Vahid.

“If the microprocessor finds that the FPGA is faster for the program part, it automatically moves that part to the FPGA, causing the program execution to ‘warp.’” By performing optimizations at runtime, Warp processors also eliminate tool flow restrictions, as well as the extra designer effort associated with traditional compile-time optimizations.
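
To make the mechanism concrete, here is a minimal C sketch of the idea, assuming a hypothetical accelerator interface – the names fpga_try_map(), run_region(), and the HOT_THRESHOLD value are illustrative, not part of UCR's actual implementation:

#include <stdio.h>

#define HOT_THRESHOLD 10000          /* calls before a region counts as "hot" */

typedef long (*kernel_fn)(long, long);

/* Ordinary software version of some frequently executed region. */
static long gcd_sw(long m, long n)
{
    while (n) { long r = m % n; m = n; n = r; }
    return m;
}

/* Stand-in for runtime synthesis: a real warp processor would decompile
 * the hot binary region and map it onto the FPGA fabric on the fly.
 * Returning NULL means synthesis failed and the software version is kept. */
static kernel_fn fpga_try_map(kernel_fn sw)
{
    (void)sw;
    return NULL;
}

static long run_region(long a, long b)
{
    static unsigned long calls = 0;
    static kernel_fn hw = NULL;

    if (++calls == HOT_THRESHOLD)    /* the region just became hot ... */
        hw = fpga_try_map(gcd_sw);   /* ... so try to "warp" it to the FPGA */

    return hw ? hw(a, b) : gcd_sw(a, b);
}

int main(void)
{
    printf("%ld\n", run_region(48, 36));   /* prints 12 */
    return 0;
}

The key point the sketch mirrors is that profiling and the decision to offload happen transparently at runtime – no recompilation and no designer involvement.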

FPGAs can benefit a wide range of applications, including video and audio processing, encryption and decryption, encoding, compression and decompression, and bioinformatics – anything that is compute-intensive and operates on large streams of data. Consumers who enhance their photos in Photoshop or edit video on their desktop computers will find that Warp processing speeds up their systems, while gamers will immediately notice better graphics and performance. Embedded systems such as medical instruments and airport security scanners can also perform real-time recognition using Warp-enhanced FPGAs.

“Thread Warping: A Framework for Dynamic Synthesis of Thread Accelerators” was named one of the top five papers at the 2007 International Conference on Hardware/Software Codesign and System Synthesis (CODES/ISSS) in Austria and was published in the conference proceedings. “Warp Processing and Just-in-Time FPGA Compilation,” the Ph.D. dissertation of Vahid’s student Roman Lysecky, was named “Dissertation of the Year” by the European Design and Automation Association in 2006.

“When large supercomputers were shrunk down into smaller devices, we didn’t know what they'd be used for,” says Vahid. “And over the years, we’ve seen the emergence of cell phones, MP3 players, smart cars, and intelligent pacemakers – technology that was hard to imagine 20 years ago. Warp processing makes the speedup potential of FPGAs accessible to every computer, whether in a PC, cell phone, or elsewhere. This makes the potential for the future development of brand-new applications – applications that we can’t conceive of now – very exciting.”

Source: University of California, Riverside

User comments (3)

Quantum_Conundrum
Oct 18, 2007
Interesting.

What happens if the system mistakenly thinks it can execute that code faster, and causes an error, or takes a lot longer instead?
saucerfreak2012
Oct 19, 2007
"...when you're tooling about town, it works like a super-fast sports car". LOL! Because we all know that's the best time to drive like a nut. HELLO?, anybody else have kids and live in town?!

Anyway, this FPGA tech sounds very promising and very common sense. Logical next step after data prefetching.
robf
Oct 20, 2007
QC: First of all, this computing technology seems to use a learning approach similar to that of genetic algorithms (GAs). I'm not proposing GAs actually be used for this, but the idea would be similar: apply a fitness value to functions. Unlike in a GA, where higher fitness raises the chance a candidate is chosen for reproduction, here a higher fitness would trigger the transfer of the function to the FPGA, which would cut the bus, CPU, and memory-access time out of each function call – you'd simply send the function's variables to the FPGA.

Let's use Euclid's algorithm for determining the GCD of two positive integers as an example. The code looks like this as a non-recursive function, where $r is the remainder, $m is the first positive integer, and $n the second.

$r = $m % $n;        # initial remainder
while ($r) {         # repeat until the remainder is zero
    $m = $n;
    $n = $r;
    $r = $m % $n;
}
# $n now holds the GCD

Now look at all the operations taking place. To execute this, you first have to send the two variables to a modulus operation, then copy the result to $r, then repeat until $r is 0. In assembly (the closest thing you'll see to what the computer is actually doing step by step), this translates to what you'd see here: http://www.cs.usf...07/gcd.s . Sure is a lot of stuff to do for that little bit of code up there, isn't it?

Now, using the FPGA warping they're discussing here, the assembly could do nothing more than two movs to the FPGA address space where the two variables are expected, then receive the final return value, leaving all those other cycles free for something else (see the sketch after this comment).

This is a grand idea, and I don't think a minor flaw in estimating the speedup that ends in reduced speed would be problematic, as long as the system "learns" that the move was not efficient. Since it would never be efficient to copy a function that is used only once to the FPGA, any candidate must be a repeated function, which allows a two-run test: first run it normally, as you would on a standard setup, then copy it to the FPGA and run it there. If the FPGA proves faster, use it; if not, revert to standard execution and remember that that is the more economical method.

robf
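
robf's "two movs to the FPGA address space" and two-run test might look roughly like the C sketch below. The base address, register layout, and decision logic are all invented for illustration – this compiles, but it would of course need real accelerator hardware mapped at that address to run:

#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical memory-mapped FPGA registers; on a real board these
 * addresses would come from the accelerator's documentation. */
#define FPGA_BASE 0x40000000UL
static volatile uint32_t *const FPGA_M   = (volatile uint32_t *)(FPGA_BASE + 0x0);
static volatile uint32_t *const FPGA_N   = (volatile uint32_t *)(FPGA_BASE + 0x4);
static volatile uint32_t *const FPGA_GCD = (volatile uint32_t *)(FPGA_BASE + 0x8);

static uint32_t gcd_sw(uint32_t m, uint32_t n)       /* standard execution */
{
    while (n) { uint32_t r = m % n; m = n; n = r; }
    return m;
}

static uint32_t gcd_hw(uint32_t m, uint32_t n)       /* the "two movs" */
{
    *FPGA_M = m;              /* write the operands ... */
    *FPGA_N = n;
    return *FPGA_GCD;         /* ... and read back the result */
}

static long elapsed_ns(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1000000000L + (b.tv_nsec - a.tv_nsec);
}

/* Two-run test: time both versions once, then remember the winner. */
static uint32_t gcd(uint32_t m, uint32_t n)
{
    static int use_hw = -1;                          /* -1 = undecided */
    if (use_hw < 0) {
        struct timespec t0, t1, t2;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        (void)gcd_sw(m, n);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        (void)gcd_hw(m, n);
        clock_gettime(CLOCK_MONOTONIC, &t2);
        use_hw = elapsed_ns(t1, t2) < elapsed_ns(t0, t1);
    }
    return use_hw ? gcd_hw(m, n) : gcd_sw(m, n);
}

Note that the real warp-processing system described in the article makes this decision from runtime profiling rather than an explicit timing test; the two-run comparison here is robf's suggested variant.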
