CRISP presents self-repairing chip
(PhysOrg.com) -- Can defective chips be reused? An EU-funded team of scientists says they can.
The CRISP ('Cutting edge reconfigurable ICs [integrated circuits] for stream processing') project, which received EUR 2.8 million under the 'Information and communication technologies' (ICT) Theme of the EU's Seventh Framework Programme (FP7), developed a technique that exploits the natural redundancy in multicore designs by using dynamically reconfigurable cores and resource management at run time.
Scientists are aware that a number of defects, such as a failure to carry out memory operations, can render a core useless. But designing chips that are entirely error-free is not practical. Developing fault-tolerant architectures, together with mechanisms that detect and fix errors or limit their effect, could therefore let circuit designers use defective chips instead of throwing them in the bin.
Chips are vulnerable to manufacturing defects, environmental disturbances that wreak havoc on production, and the effects of ageing.
Some experts believe that salvaging chips by using spare resources and implementing error detection, recovery and repair techniques would be very beneficial. Doing so would boost the chips' reliability and keep them usable even when they contain faults.
The CRISP project partners presented a self-testing, self-repairing nine-core chip at the recent DATE 2011 conference in France, showing how CRISP's technique of dynamically reconfigurable cores and resource management exploits the natural redundancy in multicore designs.
"A key innovation is the Dependability Manager, a test generation unit which accesses the built-in, self-test scan chain to effectively perform production testing at run time," New Electronics quotes Gerard Rauwerda of the Dutch-based Recore Systems, which coordinated the CRISP project, as saying at the DATE 2011 event. "This determines which cores are working correctly." The project partners developed an IP (intellectual property) 'wrapper' around Recore's reconfigurable DSP [digital signal processing] core.
Adding multiplexers gives the software the means to switch from functional mode to diagnosis mode in order to detect faults. "There are some timing issues to consider, as the circuitry is running at, say, 200 megahertz (MHz) online, instead of 25 MHz offline," Mr. Rauwerda says.
Once analysis of the device is complete, the run-time resource manager reroutes tasks to error-free parts of the chip. In effect, the chip is repaired and can continue running.
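The diagnose-then-reroute flow described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the `Core` class, the `self_test` stand-in for a scan-chain test, and the least-loaded placement policy are all hypothetical and are not taken from the CRISP project's actual Dependability Manager or resource manager.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    core_id: int
    faulty: bool = False  # hidden hardware state (hypothetical model)
    tasks: list = field(default_factory=list)

    def self_test(self) -> bool:
        """Stand-in for a scan-chain self-test; True means the core passes."""
        return not self.faulty


def diagnose(cores):
    """Run the self-test on every core and return the IDs that pass."""
    return [c.core_id for c in cores if c.self_test()]


def remap_tasks(cores, healthy_ids):
    """Move tasks off failing cores onto the least-loaded healthy core."""
    if not healthy_ids:
        raise RuntimeError("no healthy cores left")
    by_id = {c.core_id: c for c in cores}
    for core in cores:
        if core.core_id in healthy_ids:
            continue
        while core.tasks:
            task = core.tasks.pop()
            # Assumed policy: pick the healthy core with the fewest tasks.
            target = min((by_id[i] for i in healthy_ids),
                         key=lambda c: len(c.tasks))
            target.tasks.append(task)


# Nine cores, one task each, with a fault injected into core 4.
cores = [Core(i, tasks=[f"task{i}"]) for i in range(9)]
cores[4].faulty = True
healthy = diagnose(cores)
remap_tasks(cores, healthy)
```

After the remap, core 4 holds no tasks and its work has moved to a passing core, so no task is lost.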
Mr. Rauwerda points out that the technique could be applied to various cores. At present, the project's approach is to identify unusable faulty cores, and to determine if the core's memory can still be used. "In the future, the aim is to diagnose to a deeper level, to see if we can use more parts of a faulty core," he explains. "A fault tolerant interconnect is going to be very important. We will need to insert test structures into the network on chip interconnect IP for better diagnosis."
More information: http://www.crisp-project.eu/
Provided by CORDIS
Jun 08, 2011
Rank: not rated yet
On cost, I would not put in more cores than I need. If adding a core adds 10% to the packaged chip cost, then it only reduces cost if the yield was below 90% after packaging, and that assumes 100% of the fault-tolerant chips are usable.
When we have to think carefully about every 0.1 square mm of silicon this is not an option.
I'd love to see some analysis of the cost impact of this versus the impact on speed / power / overall yield.
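As a rough illustration of the break-even argument in this comment, here is a minimal sketch. It assumes that the cost per good chip is simply the unit cost divided by the usable fraction, and that every chip with the spare core is usable. The function names are hypothetical. Under those assumptions the break-even yield is 1 / 1.1, or about 91%, close to the 90% figure quoted.

```python
def cost_per_good_chip(unit_cost: float, usable_fraction: float) -> float:
    """Effective cost of one usable chip when only a fraction pass test."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return unit_cost / usable_fraction


def break_even_yield(cost_overhead: float) -> float:
    """Yield of the plain design below which the redundant design is cheaper.

    Assumes every redundant chip is usable, as the comment does.
    """
    return 1.0 / (1.0 + cost_overhead)


base_cost = 1.0   # normalised cost of the chip without a spare core
overhead = 0.10   # spare core adds 10% to the packaged chip cost
redundant_cost = cost_per_good_chip(base_cost * (1 + overhead), 1.0)
threshold = break_even_yield(overhead)

# Compare the plain design at two example yields against the redundant one.
for plain_yield in (0.85, 0.95):
    plain_cost = cost_per_good_chip(base_cost, plain_yield)
    cheaper = "redundant" if redundant_cost < plain_cost else "plain"
    print(f"yield {plain_yield:.0%}: {cheaper} design is cheaper")
```

A fuller analysis would also need the speed, power, and field-failure effects, which this sketch ignores.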
Jun 08, 2011
Rank: not rated yet
It is certainly possible - with a little hardware redundancy - to produce a CPU which upon failure can signal to an external program that a failure in component xyz has occurred. This signal would then be used to produce a custom microcode program that uses spare capacity in the CPU to work around the failure.
For example, latching the data in register A to an internal bus might normally use a microcode instruction xxx01xxx... If the latch fails, the instruction might be replaced with xxx10xxx..., with 10 instructing a redundant secondary latch to perform the same function.
There are a host of possibilities along these lines.
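The microcode substitution described in this comment could be sketched as follows. The 8-bit word width, the position of the two-bit latch-select field, and all names are hypothetical, chosen only to mirror the xxx01xxx / xxx10xxx pattern. No real CPU's microcode format is implied.

```python
LATCH_FIELD_SHIFT = 3                      # assumed position of the 2-bit field
LATCH_FIELD_MASK = 0b11 << LATCH_FIELD_SHIFT
PRIMARY_LATCH = 0b01                       # xxx01xxx: use the primary latch
SECONDARY_LATCH = 0b10                     # xxx10xxx: use the redundant latch


def patch_word(word: int) -> int:
    """Redirect one microcode word from the primary to the secondary latch."""
    selected = (word & LATCH_FIELD_MASK) >> LATCH_FIELD_SHIFT
    if selected == PRIMARY_LATCH:
        cleared = word & ~LATCH_FIELD_MASK
        return cleared | (SECONDARY_LATCH << LATCH_FIELD_SHIFT)
    return word  # words that do not use the failed latch are left alone


def patch_microcode(program):
    """Return a copy of the program that avoids the failed primary latch."""
    return [patch_word(w) for w in program]


program = [0b10101011, 0b00010000, 0b11001111]
patched = patch_microcode(program)
```

Only the words that select the failed latch change. All other bits of each word are preserved.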
Jun 08, 2011
Rank: not rated yet
This is done by modifying the cache controller so that access of a variable by CPU2 while CPU1 has the same variable causes CPU2 to revert to a previous computational state (before the access). The price paid for dropping the calculations is offset by the increased efficiency with which the CPUs can share the RAM resource.
In any case, from the comment above, if the CPU detects a latch failure, it could also revert to a previous state in such a multi-CPU system. At that point, its state could be saved, the microcode reprogrammed, the state restored, and the CPU could continue as if there had been no fault.
Jun 08, 2011
Rank: 5 / 5 (1)
My comment is that the benefits of this for most commercial devices would not offset the additional cost of having spare resources. I simply disagree with the sentiment of some of the claims, which imply that this will make it possible to design chips with an expectation that they will have faults that can be worked around by redundancy.
As a silicon chip architect, I would never condone shipping a product that had failed production test partly in a logic domain. It's a bit different for memory (flash / DRAM cells etc.), but when an error occurs in logic you do not know the cause without physical examination, so the chance of the device then failing completely in the field is greatly increased.