Scientists suggest silicon chips should be allowed to make errors

May 26, 2010 by Lin Edwards

(PhysOrg.com) -- Researchers in the U.S. have discovered that allowing silicon chips to make errors could help computers continue to become more powerful while using less energy.

Chip makers struggle to squeeze more performance out of chips for the same power, but the latest research findings suggest they could deliver greater performance with lower power requirements if the rules governing how chips work were relaxed.

Moore’s Law predicts that the number of transistors (tiny switches) that can fit on a given area of silicon for a given price will double every 18-24 months, and this law has been followed by reducing the size of transistors, which generally results in more powerful processing. The problem is that as the transistors become smaller, their reliability and variability become issues.

Rakesh Kumar, an assistant professor at the University of Illinois at Urbana-Champaign, thinks the insistence on making silicon chips operate with no errors is hastening the end of Moore’s Law and forcing chips to be run at a higher power than necessary just to ensure they never make mistakes. A sizeable proportion of chips also have to be rejected if they are less than perfect, and this increases manufacturing costs. The problems are all worsening as the size of components continues to decrease.

In an interview with the BBC Professor Kumar said that instead of insisting on perfect, error-free chips, we should embrace the imperfections to make what he calls “stochastic processors” that will be allowed to make random errors. He pointed out that the hardware is already stochastic, which means it is not flawless, and spending more money to make it appear to be flawless is a waste.

Kumar and colleagues are putting these ideas into practice by designing chips that are not flawless and then managing the number and type of errors. This can reduce power consumption by up to 30 percent, and the reduction is still 23 percent with an error rate of just one percent.
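
The trade-off described here can be illustrated with a toy model. The sketch below (Python, with made-up voltage and error-rate numbers, not the team's actual design) lowers a notional supply voltage, assumes dynamic energy falls roughly with the square of the voltage while the chance of a bit flip rises, and counts how often a simple addition comes out wrong:

```python
import random

def noisy_add(a, b, bit_error_rate, width=32):
    """Add two integers on a hypothetical 'stochastic' adder:
    each output bit may flip with probability bit_error_rate."""
    result = (a + b) & ((1 << width) - 1)
    for bit in range(width):
        if random.random() < bit_error_rate:
            result ^= 1 << bit  # inject a random bit error
    return result

def relative_energy(vdd_scale):
    """Toy model: dynamic energy scales roughly with Vdd squared
    (illustrative only, not the figures from the research)."""
    return vdd_scale ** 2

# Made-up operating points: scaling Vdd down saves energy but
# raises the chance that any given output bit is wrong.
operating_points = [(1.00, 0.0), (0.90, 0.001), (0.85, 0.01)]

for vdd, err_rate in operating_points:
    wrong = sum(noisy_add(1234, 5678, err_rate) != 6912 for _ in range(10000))
    print(f"Vdd x{vdd:.2f}: energy {relative_energy(vdd):.0%} of nominal, "
          f"{wrong / 100:.1f}% of additions wrong")
```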

Professor Kumar said the errors will usually not have a significant effect on a computer, but in some cases they could cause it to crash. To counteract this, the team is investigating ways to build applications that respond to errors simply by taking longer to execute the affected instructions, or that use a log of a user’s actions to identify unexpected errors. This research may also be useful for making existing applications better able to cope with errors that already occur.
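
One common pattern for the error-aware software the team describes (this is a generic sketch, not their code) is to pair a computation with a cheap acceptance check and simply re-execute when the check fails, so an occasional hardware error costs time rather than correctness:

```python
def run_with_retry(compute, check, max_retries=3):
    """Run 'compute' on unreliable hardware; if the cheap 'check'
    rejects the result, re-execute - the program just takes longer."""
    for _ in range(max_retries + 1):
        result = compute()
        if check(result):
            return result
    raise RuntimeError("result failed validation after retries")

# Example: a sum whose plausibility can be bounded cheaply.
data = [3, 1, 4, 1, 5, 9]
total = run_with_retry(
    compute=lambda: sum(data),                # would run on the error-prone core
    check=lambda s: 0 <= s <= len(data) * 9,  # sanity bound, not a full proof
)
```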

User comments

ZeroX
3.5 / 5 (6) May 26, 2010
Such mistakes could become more expensive than the energy savings. And making applications more tolerant of such errors could mean they become much slower. Instead of this we should invest in less bulky programs and operating systems.
ZeroX
3 / 5 (2) May 26, 2010
Recently I have noticed a tendency to transfer the responsibility of technology producers into the future (in accordance with the "life on credit" paradigm), or onto other members of the consumer chain, instead of making true savings.

For example, so-called biofuels or the production of transgenic plants may appear to be a big deal and a money saver, but on a deeper view we can see that they just dissipate the cost of production into the neighbouring space-time. By making processors more prone to errors, processor producers would just transfer their responsibility for error checking to software developers.

But society as a whole (which still needs exact data) will gain nothing from such "savings".
Skeptic_Heretic
not rated yet May 26, 2010
This is a relaxation of the baseline standards, i.e. prime-grade parts would become more akin to discount-grade ones.

Aside from the power saving costs, what would the end user and consumer economic impacts be? Would we get faster chips for lower prices?

Bonkers
5 / 5 (1) May 26, 2010
Firstly, it's mainly an economic argument: the cost of correcting errors vs the savings of lower-grade hardware.
Second, it's already done - we have error-correcting ECC DRAM and error-correcting Flash/EEPROM.
The problem is that single-bit errors are easy, but a higher error rate introduces many more (by the law of squares) double and triple bit errors - and these are costly to correct, needing more redundancy than was saved.
Thirdly, there are always rare multiple errors that need a systems approach, "can you run that calculation again" - and fixing this is ruinous, at present.
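
To see why single-bit errors are the easy case, here is a plain Hamming(7,4) sketch in Python (a generic textbook code, not tied to any particular memory part): one flipped bit is located and corrected by the syndrome, while a second flip silently defeats it, which is why higher raw error rates demand the costlier redundancy described above.

```python
def hamming74_encode(d):
    """Encode 4 data bits (0/1 list) into a 7-bit Hamming codeword.
    Positions (1-based): p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Correct any single flipped bit; returns (data_bits, error_position)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based error position, 0 = clean
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], syndrome

codeword = hamming74_encode([1, 0, 1, 1])
codeword[5] ^= 1                               # single-bit fault
print(hamming74_correct(codeword))             # ([1, 0, 1, 1], 6): recovered

codeword = hamming74_encode([1, 0, 1, 1])
codeword[2] ^= 1
codeword[6] ^= 1                               # double-bit fault
print(hamming74_correct(codeword))             # wrong data, silently "corrected"
```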
ZeroX
5 / 5 (1) May 26, 2010
Aside from the power saving costs, what would the end user and consumer economic impacts be?
This is the problem with many environmental "savings": in their consequences they could lead to even higher pollution. It's more effective to check hardware errors at the hardware level than at the software level.

http://www.pcworl...own.html
SincerelyTwo
4.6 / 5 (5) May 26, 2010
There are applications where degrees of error are tolerable, for those this would be fine. This might be a problem for a lot of scientific applications, but not all applications.

People need to stop thinking in terms of this replacing everything, this article never actually states that's the intention.

What this article does tell you is there are methods of increasing battery life by sacrificing quality. That's an objective statement based on and supported by research, how 'acceptable' the consequences are depend on the specific application.

So no, it's not a good or bad thing, it's a depends thing.

Learn to think differently.
Bonkers
not rated yet May 26, 2010
Fair point, but there are already low-grade memories (test fails) that are used in non-critical applications like voice/telephone recorders. It is hard to guarantee that all users of a memory device can accept errors, so applications are limited - it will never be mainstream. I'm all for re-examining, but in this case the regular approach of reduced power and cost at a 10^-11 error rate wins. There are newer technologies on the roadmap: multi-level cell, memristor arrays, multi-wire crosspoint arrays, etc.
ghinckley68
5 / 5 (6) May 26, 2010
AS A SOFTWARE ENG THIS IS JUST A REALLY REALLY REALLY REALLY BAD IDEA. Current software, even mission-critical stuff, rarely if ever checks to see if the answer is correct; it's hard enough to get most programmers to perform sanity checks (just to make sure the data is within reasonable bounds). If this were to happen in chips, nearly every piece of software out there would have to be scrapped.
CSharpner
5 / 5 (4) May 26, 2010
This can't work for general purpose computing where a "tiny" error like which memory address to send the program counter to would have disastrous effects. Most executable code can't have errors. But there are some instances that /could/ allow for errors, such as live 3D rendering. Suppose a portion of the screen is rendered incorrectly for 1 frame (1/30th of a second)... that'd probably be acceptable if we could achieve greater speed and/or resolution and/or energy savings. Make that a user-controllable setting like anti-aliasing, shadows, etc. are now.

Maybe even have error-capable instructions built into a CPU. A programmer could choose which calculations are acceptable for errors in order to gain speed or save on energy and, more importantly, which are NOT.

To be honest, I can think of VERY FEW instances where anything less than 100% accuracy is acceptable in my day-to-day software development.
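
The "error-capable instructions" idea above could look something like the following sketch - an entirely hypothetical API, since no shipping CPU exposes such an opt-in today - where the programmer picks, per operation, whether the exact path or the low-power, occasionally wrong path is used:

```python
import random

class HypotheticalStochasticCPU:
    """Toy model of an opt-in error-tolerant execution mode.
    Purely illustrative; not a real instruction set."""

    def __init__(self, relaxed_error_rate=0.01):
        self.relaxed_error_rate = relaxed_error_rate

    def add_exact(self, a, b):
        return a + b                       # full-margin, guaranteed-correct path

    def add_relaxed(self, a, b):
        result = a + b                     # low-power path: may flip one bit
        if random.random() < self.relaxed_error_rate:
            result ^= 1 << random.randrange(16)
        return result

cpu = HypotheticalStochasticCPU()
next_address = cpu.add_exact(0x4000, 4)        # control flow: must be exact
pixel_shade = cpu.add_relaxed(128, 17) & 0xFF  # one frame's pixel: tolerable
```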
CSharpner
5 / 5 (3) May 26, 2010
...continued.

In fact, the trend in software architecture is toward being less energy efficient, such as programs running in virtual environments like the Java runtime or the .NET CLR. This is a trade-off for easier programming, more efficient programmers, fewer errors, quicker time to market, and improved efficiency for the users of the software. The extra energy expended to execute code in a VM is more than offset by these other benefits.

Don't misunderstand though, if we can keep all these benefits AND have reduced energy and faster processing, I'm all for it and I think it could be accomplished in a limited set of uses.
JCincy
not rated yet May 26, 2010
I think you could simulate the same results using Microsoft Vista. :)
Skeptic_Heretic
not rated yet May 26, 2010
In fact, the trend in software architecture is toward being less energy efficient, such as programs running in virtual environments like the Java runtime or the .NET CLR.
The extra energy expended to execute code in a VM is more than offset by these other benefits.

Oh that's so wrong it's not even funny.

Power savings in virtual environments is calculable. For example, the average IIS server might utilize 10 to 20% of its power resources, wasting 80 to 90% of those resources in idle time and the same energy use.

So with redundant 500W PS units being the norm, that is 800 to 900W/h wasted per unit.

With virtual or VM environments you can utilize those unused resources and get pretty close to 100% utilization.

Think of it in a datacenter rather than in your home. 10 IIS servers at a total use of 10 KW/h and waste 9 KW/h doing nothing. 1 VM IIS environment uses 1KW/h with a waste of 50-100 Watts/hour. 90% cost savings, not counting hardware, AC, network and etc.
CSharpner
5 / 5 (1) May 26, 2010
Skeptic, you're talking about VMs as in VMWare and Virtual PC and what you say is correct in that context. You misunderstood my use of the acronym "VM" (which is why I mentioned Java & .NET to make it clear). I didn't mean it in the context of VMWare or Virtual PC, but in the sense of a Java VM and .NET CLR. VMs (in the context of VMWare et al) definitely help reduce power consumption by requiring fewer servers. But... that's not what I'm talking about. I'm talking about software architected by software developers. Adding two numbers in one of those language VMs (Java VM or .NET CLR) is more CPU intensive than "on the metal" with native CPU instructions (i.e. C/C++/Assembly). Microsoft has done speed tests on .NET compared to native code and measured that .NET code runs about 40% slower than native code. Software dev has been moving more and more towards /that/ kind of architecture and /that/ kind of architecture is less energy efficient.

VMWare VMs certainly help though.
Skeptic_Heretic
5 / 5 (1) May 26, 2010
@Csharpner: Ah, ok. In that sense I agree with you. My mistake, too many acronyms in the computer world. I look at those "VMs" as VE's, virtual environments rather than virtual machines, again, just acronym semantics and useless to the conversation at hand. What you posted is factual.
nonoice
5 / 5 (1) May 27, 2010
This kind of chip would be very interesting as a co-processor: have all the program logic on an error-free chip and perform physics simulation, with its inherent stochasticity, on a chip with 'free' random numbers.
It could also be a good playground for learning how to program stochastically, because the next big step will be quantum computing, which has a huge error rate compared to this measly 1%.
Valentiinro
1 / 5 (1) May 27, 2010
This kind of chip would be very interesting as a co-processor: have all the program logic on an error-free chip and perform physics simulation, with its inherent stochasticity, on a chip with 'free' random numbers.
It could also be a good playground for learning how to program stochastically, because the next big step will be quantum computing, which has a huge error rate compared to this measly 1%.


That's exactly what I was thinking. Everyone's got their dual/quad/many-core computers now; if one core were set aside to run efficient but error-prone computations, it would save some power.
SMMAssociates
not rated yet May 29, 2010
Sure sounds like Zero's guys were involved in this pipe dream....

Balderdash....

Errors are acceptable in things like image processing and storage, but that implies enough intelligence in the process AND hardware to separate stuff like that from things like how my paycheck is created. I don't see any gain there....

Quantum_Conundrum
not rated yet May 29, 2010
This is completely pointless in so many ways.

It's definitely unacceptable in any business application storing or retrieving payroll, accounts receivable or payable, goods shipped or received, etc.

It's also bad for simulations.

And it just plain SUCKS even for gaming. Nobody wants an unfair "mistake prone" game engine for a multiplayer game.

It's bad for websites. A one bit mistake pretty much anywhere in the processing of a script file can change a markup code to something completely different, and then your entire web application is garbled, or even corrupts data or output.

Also, given the other article on the new 4nm transistor, the guy's entire premise has been flat out proven wrong. In a year or two when this stuff hits the market, he won't even care about this nonsense. Within 5 years, everyone will own a computer as powerful as a university's best super computer...
JoeDuff
not rated yet May 29, 2010
There are a few examples where such a paradigm could make some sense anyway. For example, decoding audio or video is tolerant of a substantial level of errors (the audio CD format has no internal CRC bytes built in, so what you read from an audio CD is what you hear - a few wrong bytes play no role there at all).

For such applications some second-grade processor ought to be enough. All the more so in the case of 3D video decoding, where speed is preferred over quality.
Sanescience
not rated yet May 30, 2010
Um, yea... I'm going to have to ask you to stop drinking the crazy juice on this one.

There might be a few niche uses for something like that but nothing that would benefit from large scale production to lower cost.
magpies
not rated yet May 30, 2010
Are they joking with this? Why don't I just watch static on a computer screen and pretend I'm surfing the web...
akotlar
not rated yet May 30, 2010
Are they joking with this?


Basically, the reason it is being proposed is that the energy you expend getting extremely high precision could be conserved by using lower, but acceptable, precision. There is no possibility of getting infinite accuracy. Computers operate at a maximum (I believe?) of 112 bits of precision, but why do you need it? Since you have to have a rounding error anyway, why not set the limit on errors higher?

Commonly an analogy to the human brain is brought up. Essentially the brain is a very noisy collection of circuits. It produces tons of errors, but if it didn't, if it required very high precision, it would have to operate at much lower frequencies, and spend much effort ensuring every calculation was accurate, via many layers of error correction. This would kill your ability to process many concepts in flight, and would essentially make you unable to function at a high level.
akotlar
not rated yet May 30, 2010
Quantum conundrum
This is completely pointless in so many ways.


Sure, if you can achieve perfect precision with 0 work investment, go ahead. Unfortunately, at very small transistor scales, you have unavoidable quantum noise, which puts a limit on how high you can set your frequency, and creates an exponential rise in heat & power requirements to achieve the desired precision level (~16-34 decimal digits for current processors).

Your processor compensates for the noise by essentially pumping as many electrons through the circuit as you need to get the desired level of precision. Even still, to get that precision you have to have very low frequency (why CPU's have been stuck at
komone
not rated yet May 30, 2010
The balance of your account is probably 743.45
Quantum_Conundrum
not rated yet May 30, 2010
akotlar:

The difference between a human brain and a computer is that with computers we can network a theoretically infinite number of processor cores and RAM to work on time-restricted applications that are either multi-threaded or can be broken down into multi-threaded work - which is to say, graphics, rendering, sensory input, modeling, simulations, etc.

One other thing, "precision" and "accuracy" have very different definitions. An "error" is a matter of "accuracy", while "precision" is controlled for, and if understood by the user, is often acceptable.

Precision is when you say "the distance to such-and-such an object is X ly +/- Y ly." So then you are precise to within Y ly.

Accuracy, is a matter of error, and pertains to whether or not you hit the target, for example. If a computer makes an error, then it is inaccurate, and then if this is an aircraft's flight control, people die, for example.

Precision is "oh, my instrument could be off by +/- a millimeter."
Quantum_Conundrum
not rated yet May 30, 2010
The balance of your account is probably 743.45


yeah...

Now where is that error the computer was allowed to make? Should the 7 be a 9? Should the decimal be one place to the right? Is that in dollars, Pesos, or is it Euros?
akotlar
not rated yet May 31, 2010

Precision is "oh, my instrument could be off by +/- a millimeter."


I'm not sure I see the difference between the brain and multi-threaded applications. The brain is far more parallel than any classical computer could ever hope to be. Each neuron processes information * 100 billion neurons + calculations at the 100 trillion synapses and in glial cells (various theories on this)

Precision and error are only distinguished by the latter exceeding the limit on errors. Arbitrary unless you speak to application.

You could increase precision by using some sort of software error correction or some precision extension algorithm where necessary. Today's programs do so already, but obviously taking into account the lower tolerance for error of today's cpus. You'll need to evaluate whether the benefits in power & frequency are outweighed by the extra cost for guaranteeing precision for simulation or whatever you need it for.
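
One concrete form such software-level error correction could take (a sketch of textbook triple modular redundancy, not anything proposed in the article) is to recompute and majority-vote only the results that genuinely must be exact:

```python
import random
from collections import Counter

def unreliable_square(x, error_rate=0.01):
    """Stand-in for a computation on an error-prone core."""
    result = x * x
    if random.random() < error_rate:
        result ^= 1 << random.randrange(20)    # occasional bit flip
    return result

def vote(compute, x, runs=3):
    """Mask rare hardware errors by majority vote (triple modular
    redundancy in software): pay ~3x the work only where it matters."""
    tally = Counter(compute(x) for _ in range(runs))
    value, count = tally.most_common(1)[0]
    if count == 1:                             # no majority: try again
        return vote(compute, x, runs)
    return value

print(vote(unreliable_square, 12345))          # 152399025 with high probability
```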
akotlar
not rated yet May 31, 2010
Meant to say precision & program error is distinguished by the latter being an exceedance of acceptable error (which is a measure of precision).

The genius (imo maybe) of stochastic processors is that you achieve an exponential reduction in power usage compared to loss of precision bits. All of that saved power can be converted to exponentially more processing power, either through more cores in the same heat/power envelope, or higher frequency.

It seems ridiculous to me that anyone could see the loss of precision as being a bad thing in this light unless you're folding proteins or doing other HPC stuff, and in those cases it may still be worth it.
magpies
not rated yet May 31, 2010
Yeah, you're right, and the first systems that should get these new processors should be the ones that control nukes or your bank account.
SincerelyTwo
not rated yet Jun 03, 2010
magpies,

Why should those systems get those processors first? That's a very bad idea if you ask me; I think better applications would be neural networks, which can build in self-correcting mechanisms. Hell, even image and signal analysis processors designed around lossless compression could work - humans are still capable of understanding poor audio and seeing poor images. In military and telecom industries these kinds of processors could do perfectly well, maybe even better, in space where signals need to be compressed but battery life is critical.

People like you always need help with thinking, and when you run into brick walls you become sarcastic and dumb. :]