CRISP presents self-repairing chip

Can defective chips be reused? An EU-funded team of scientists says they can.

The CRISP ('Cutting edge reconfigurable ICS [Integrated Circuit Systems] for stream processing') project, which clinched EUR 2.8 million under the 'Information and communication technologies' (ICT) Theme of the EU's Seventh Framework Program (FP7), developed a new technique designed to take advantage of the natural redundancy in multicore designs so as to enable the use of reconfigurable cores and resource management during the program's lifecycle phase.

Scientists are aware that a number of defects, such as failed memory operations, will make a chip useless. But designing chips that are entirely error-free is not viable. Developing fault-tolerant architectures that work together with mechanisms able to detect and fix errors, or ease their effect, could therefore help circuit designers use defective chips instead of throwing them in the bin.

Chips are vulnerable because of manufacturing defects, environmental disturbances that play havoc on production, and the effects of ageing.

Some experts believe that salvaging such chips, by using spare resources and implementing error detection, recovery and repair techniques, would be very beneficial. Doing so would boost their reliability, make them behave more consistently and keep them usable even when faulty.

The CRISP project partners presented a self-testing, self-repairing nine-core chip at the recent DATE 2011 conference in France, showing how CRISP's technique exploits the natural redundancy in multicore designs through dynamically reconfigurable cores and resource management.

"A key innovation is the Dependability Manager, a test generation unit which accesses the built-in, self-test scan chain to effectively perform production testing at run time," New Electronics quotes Gerard Rauwerda of the Dutch-based Recore Systems, which coordinated the CRISP project, as saying at the DATE 2011 event. "This determines which cores are working correctly." The project partners developed an IP (intellectual property) 'wrapper' around Recore's reconfigurable DSP (digital signal processing) core.

Adding multiplexers gives the software the means to switch from functional mode to diagnosis mode in order to detect faults. "There are some timing issues to consider, as the circuitry is running at, say, 200 megahertz (MHz) online, instead of 25 MHz offline," Mr. Rauwerda says.

Once analysis of the device is complete, the run time resource manager reroutes tasks to error-free parts of the chip. So the chip is repaired and can continue running.
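The test-then-reroute flow described above can be illustrated with a short Python sketch. This is only a toy model, not CRISP's actual implementation: the `Core` class, the `run_self_test` stand-in for the scan-chain test, and the least-loaded placement policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical model of one core on a multicore chip."""
    core_id: int
    healthy: bool = True
    tasks: list = field(default_factory=list)


def run_self_test(core, faulty_ids):
    """Stand-in for a run-time self-test (assumption: faults are given
    as a set of core ids rather than found by a real scan chain)."""
    core.healthy = core.core_id not in faulty_ids
    return core.healthy


def reroute_tasks(cores):
    """Move every task on a faulty core to the least-loaded healthy core."""
    healthy = [c for c in cores if c.healthy]
    if not healthy:
        raise RuntimeError("no healthy cores left to take over the work")
    for core in cores:
        if core.healthy:
            continue
        while core.tasks:
            task = core.tasks.pop(0)
            target = min(healthy, key=lambda c: len(c.tasks))
            target.tasks.append(task)
    return cores


def build_demo():
    """Nine cores, one task each; core 4 fails its self-test."""
    cores = [Core(i, tasks=[f"task{i}"]) for i in range(9)]
    for core in cores:
        run_self_test(core, faulty_ids={4})
    return reroute_tasks(cores)


if __name__ == "__main__":
    for core in build_demo():
        state = "ok" if core.healthy else "faulty"
        print(core.core_id, state, core.tasks)
```

In this sketch the faulty core ends up with no work while the total number of tasks stays the same, which mirrors the idea that the chip keeps running on its remaining good cores. A real resource manager would also have to account for task placement constraints, interconnect routing and timing.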

Mr. Rauwerda points out that the technique could be applied to various cores. At present, the project's approach is to identify unusable faulty cores, and to determine if the core's memory can still be used. "In the future, the aim is to diagnose to a deeper level, to see if we can use more parts of a faulty core," he explains. "A fault tolerant interconnect is going to be very important. We will need to insert test structures into the network on chip interconnect IP for better diagnosis."

Provided by CORDIS
Citation: CRISP presents self-repairing chip (2011, June 8) retrieved 22 May 2019 from

User comments

Jun 08, 2011
All very nice for High Rel systems, but not so practical in a commercial environment for a number of reasons.
For cost I would not put more cores in than I need, if adding a core adds 10% to the packaged chip cost then it only reduces cost if the yield was below 90% after packaging, and assuming that 100% of the fault tolerant chips were usable.
When we have to think carefully about every 0.1 square mm of silicon this is not an option.
I'd love to see some analysis of the cost impact of this versus the impact on speed / power / overall yield.

Jun 08, 2011
Remember that, e.g., AMD made a lot of money selling 'three-core' chips after testing showed that one core of a quad was bad. Also, knowing that a multi-core chip won't just 'give up' when one core glitches would be very good news to server and render farms, and other 'mission critical' hardware...

Jun 08, 2011
I don't doubt the benefits of such technologies for fault tolerant systems, and similar ideas are used in magnetic disks where redundant sectors are used to map out bad sectors making the disk appear perfect ( until it runs out of spare sectors but you should replace it long before that happens ).
My comment is that the benefits of this for most commercial devices would not offset the additional cost of having spare resources. I simply disagree with the sentiment of some of the claims which implies that this will make it possible to design chips with an expectation that they will have faults that can be worked around by redundancy.
As a silicon chip architect I would never condone shipping a product that had failed production test partly in a logic domain, it's a bit different for memory ( flash / dram cells etc) but when an error occurs in logic you do not know the cause without physical examination so the chance of a device then failing completely in the field is greatly increased.

Jun 25, 2011
Thank you for your good post. I was seeking this kind of post.
