CRISP presents self-repairing chip
The CRISP ('Cutting edge reconfigurable ICs [integrated circuits] for stream processing') project, which clinched EUR 2.8 million under the 'Information and communication technologies' (ICT) Theme of the EU's Seventh Framework Programme (FP7), developed a new technique designed to take advantage of the natural redundancy in multicore designs through the use of reconfigurable cores and resource management during the program's lifecycle phase.
Scientists know that certain defects, such as a failure to carry out memory operations, can render a core useless. But designing error-free chips is not really viable. Developing fault-tolerant architectures that work together with mechanisms able to detect and fix errors, or mitigate their effects, could therefore help circuit designers use defective chips instead of discarding them.
Chips are vulnerable because of manufacturing defects, environmental disturbances that play havoc with production, and the effects of ageing.
Some experts believe that salvaging such chips would be very beneficial when combined with the use of spare resources and an error detection, recovery and repair technique. Doing so would boost the chips' reliability and keep them usable even when faulty.
The CRISP project partners presented a self-testing, self-repairing nine-core chip at the recent DATE 2011 conference in France, showing how the natural redundancy in multicore designs can be exploited through CRISP's technique of using dynamically reconfigurable cores and resource management.
"A key innovation is the Dependability Manager, a test generation unit which accesses the built-in, self-test scan chain to effectively perform production testing at run time," New Electronics quotes Gerard Rauwerda of the Dutch-based Recore Systems, which coordinated the CRISP project, as saying at the DATE 2011 event. "This determines which cores are working correctly." The project partners developed an IP (internet protocol) 'wrapper' around Recore's reconfigurable dsp [digital signal processing] core.'
Adding multiplexers gives the software the means to switch from functional mode to diagnosis mode in order to detect faults. "There are some timing issues to consider, as the circuitry is running at, say, 200 megahertz (MHz) online, instead of 25 MHz offline," Mr Rauwerda says.
Once analysis of the device is complete, the run-time resource manager reroutes tasks to error-free parts of the chip. In effect, the chip is repaired and can continue running.
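The test-then-reroute idea can be illustrated with a minimal Python sketch. It is not CRISP's actual software: the `Core` class, the `run_self_test` stand-in for the scan-chain test and the least-loaded placement policy in `remap_tasks` are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical model of one core: an id, a health flag and its task queue."""
    core_id: int
    healthy: bool = True
    tasks: list = field(default_factory=list)


def run_self_test(core, faulty_ids):
    """Stand-in for the run-time self-test (assumption): a core fails
    if its id is listed in faulty_ids."""
    core.healthy = core.core_id not in faulty_ids
    return core.healthy


def remap_tasks(cores):
    """Move every task off faulty cores onto the least-loaded healthy core.

    The least-loaded policy is an illustrative choice, not the project's.
    """
    healthy = [c for c in cores if c.healthy]
    if not healthy:
        raise RuntimeError("no fault-free core available")
    for core in cores:
        if core.healthy:
            continue
        while core.tasks:
            task = core.tasks.pop(0)
            # Ties are broken by the lowest core id to keep the result deterministic.
            target = min(healthy, key=lambda c: (len(c.tasks), c.core_id))
            target.tasks.append(task)
    return cores


if __name__ == "__main__":
    # Nine cores, one task each; cores 3 and 7 are assumed to fail the test.
    cores = [Core(i, tasks=[f"task-{i}"]) for i in range(9)]
    for c in cores:
        run_self_test(c, faulty_ids={3, 7})
    remap_tasks(cores)
    for c in cores:
        print(c.core_id, "ok" if c.healthy else "faulty", c.tasks)
```

In this toy setup the two faulty cores end up with empty queues and their tasks are picked up by healthy cores, so the overall workload is preserved.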
Mr Rauwerda points out that the technique could be applied to various cores. At present, the project's approach is to identify unusable faulty cores, and to determine whether a faulty core's memory can still be used. "In the future, the aim is to diagnose to a deeper level, to see if we can use more parts of a faulty core," he explains. "A fault-tolerant interconnect is going to be very important. We will need to insert test structures into the network-on-chip interconnect IP for better diagnosis."