Process variation threatens to slow down and even pause chip miniaturization

January 5, 2016 by Sparsh Mittal
Plot of CPU transistor counts against dates of introduction; note the logarithmic vertical scale; the line corresponds to exponential growth with transistor count doubling every two years. Credit: Wikipedia

For the past several decades, the processor industry has enjoyed the benefits of chip miniaturization and the exponential increase in the number of on-chip transistors predicted by Moore's law. However, as process technology scales to smaller feature sizes, precise control of fabrication processes has become increasingly difficult. As a result, 'process variation' (PV), which refers to the deviation of device parameters from their nominal specifications, has become much more severe.

At technology nodes above roughly 350nm, PV had a negligible effect on processors, since the magnitude of variation was insignificant compared to the device size. However, with ongoing process scaling, the effect of PV can be seen on all metrics of interest, such as performance, energy and yield.

For example, due to PV, the maximum clock frequency of different cores in a 65nm, 80-core Intel processor can vary between 5.7 GHz and 7.3 GHz. Similarly, due to PV, the timing parameters in a DDR3 DRAM device can be up to 66 percent lower than the datasheet specifications. PV can lead to as much as 9X variation in sleep power across different instances of ARM Cortex M3 processors. In phase change memory (PCM), the write endurance of different cells can vary by up to 50X due to PV.

The effect of PV also increases at low voltages. As the supply voltage continues to scale with process scaling (e.g., from 5V at 800nm to ~1.1V at 32nm), or as voltage-scaling approaches are deployed to save energy, the effect of PV is expected to worsen. In fact, one study reports chip yields falling from nearly 90 percent at 350nm to 50 percent at the 90nm feature size. It has been estimated that, if left unaddressed, PV can wipe out the performance gain obtained from an entire process technology generation.

These points are highlighted in a recent survey paper titled "A Survey Of Architectural Techniques for Managing Process Variation" by ORNL researcher Sparsh Mittal. The paper, accepted by ACM Computing Surveys in 2015, investigates the impact of PV, along with strategies for mitigating it, across a wide range of system architectures: CPUs and GPUs, processor components (cache, main memory, processor core), memory technologies (SRAM, DRAM, eDRAM and non-volatile memories such as PCM and resistive RAM), and both 2D and 3D processors.

The paper also summarizes some commonly used system-level techniques for managing process variation, such as task scheduling, dynamic voltage and frequency scaling (DVFS) and the use of redundant storage. For example, in multicore processors, tasks can be scheduled on the core that is least affected by PV. Similarly, a higher supply voltage or additional refresh operations can be provisioned for the block most affected by PV. Further, PV-affected parts (e.g., registers or cache blocks) can be disabled, with normal or spare parts used in their place. Faults in PV-affected parts can also be corrected using error-correcting codes (ECC). These techniques have shown significant potential in alleviating the impact of PV on processors.
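
The scheduling idea can be illustrated with a small Python sketch. It is not taken from the survey. It assumes that each core's maximum stable frequency has already been measured and that a higher value means the core is less affected by PV. The class and function names and the per-core frequencies are hypothetical; the frequencies were simply chosen to fall within the 5.7 GHz to 7.3 GHz range quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    max_freq_ghz: float  # assumed: measured per-core maximum stable frequency
    busy: bool = False


def pick_core(cores):
    """Return the idle core with the highest maximum frequency, or None."""
    idle = [c for c in cores if not c.busy]
    if not idle:
        return None
    # Assumption: highest attainable frequency == least affected by PV.
    return max(idle, key=lambda c: c.max_freq_ghz)


def schedule(tasks, cores):
    """Greedily place each task on the best idle core; return {task: core_id}."""
    placement = {}
    for task in tasks:
        core = pick_core(cores)
        if core is None:
            break  # no idle cores left; remaining tasks stay unscheduled
        core.busy = True
        placement[task] = core.core_id
    return placement


# Hypothetical per-core frequencies (GHz) for a four-core chip.
cores = [Core(0, 5.7), Core(1, 7.3), Core(2, 6.4), Core(3, 6.9)]
placement = schedule(["render", "encode"], cores)
print(placement)  # {'render': 1, 'encode': 3}
```

A real scheduler would likely also weigh leakage power, temperature and task criticality; this sketch ranks cores on frequency alone to keep the idea visible.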

As ongoing process scaling confronts the formidable challenge of rising process variation, the design of computing systems is likely to undergo a major overhaul. Overcoming these obstacles to design variation-resilient computing systems is the challenge that awaits us in the near future.

More information: A Survey Of Architectural Techniques for Managing Process Variation: … ng_Process_Variation

1 comment

Jan 05, 2016
The article's graph is a false presentation of Moore's law because it pits differently priced processors against each other. You can always make a larger chip with more transistors - you simply pay more for it.

Moore's law is about the largest number of transistors on a single chip at the optimal price per transistor. If you pick similarly priced processors - or rather the best-selling, best-value chips - from 1970 to 2015, the exponential curve vanishes about half-way in.

Every single time Moore's law is mentioned, they get it wrong.
