Researchers build self-repairing "systemic" computer

Feb 15, 2013 by Bob Yirka
'Blue screen of death' as seen in Windows 8. Credit: Wikipedia

(Phys.org)—Computer scientists Christos Sakellariou and Peter Bentley, working together at University College London, have built a new kind of computer that runs instruction segments randomly rather than sequentially, resulting in a computer that, in theory, should never crash.

One of the main reasons that computers crash is the way they execute instructions: sequentially. Code is written in a step-by-step fashion, and the computer follows a counter that retrieves lines of code in the proper order, executing each one before moving on to the next. Problems arise when the counter becomes mixed up, or when code that has been executed fails to return control so that the next line can be run. To get around that problem, the researchers in Britain have built a computer that doesn't run sequentially at all. It runs chunks of information, each made up of both code and data, and does so in random fashion, removing the sequential processing problem. The result, they say, is a computer that is able to repair itself on the fly and, in theory, will never crash.
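A toy sketch in Python (my own illustration, not the authors' hardware) may help picture the contrast: instead of a counter stepping through statements in order, small self-contained "systems" that bundle code with their own data are picked and run at random.

import random

# Each "system" couples a bit of code with the data it operates on.
# Nothing steps through them in a fixed order; any ready system may run.
systems = [
    {"name": "increment", "data": {"x": 0},
     "code": lambda d: d.update(x=d["x"] + 1)},
    {"name": "double", "data": {"y": 1},
     "code": lambda d: d.update(y=d["y"] * 2)},
]

for _ in range(10):                # no program counter, no fixed sequence
    s = random.choice(systems)     # pick any system at random
    s["code"](s["data"])           # run its code on its own data

print({s["name"]: s["data"] for s in systems})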

The whole idea is based on nature's distributed processing abilities, as demonstrated by such brilliant constructs as the human brain. As people exist, they think, they react and respond. They do all manner of things, none of which occurs as the result of a sequential processor in a central part of the brain. Instead, things are done in a distributed manner, with different biological processors working on different things at the same time. To make this happen with a computer, the researchers used a field programmable gate array (FPGA), which is essentially a bit of electronics that serves as a sort of traffic cop. Its main job is to make sure that different segments, or "systems" as the researchers call them, get called on, albeit in random fashion, and to allocate a place for them to run. One of the benefits of such a design is that no system has to wait for another to finish before running, which means the computer can run several systems at the same time. Thus, the FPGA is a resource manager, though it also manages the information that flows between systems.
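A rough software stand-in for that traffic-cop role (hypothetical and written in Python; the real machine does this in FPGA hardware) might look like this: each tick, a random subset of systems is given a slot, they can run at the same time, and their outputs are relayed on to other systems.

import random
from concurrent.futures import ThreadPoolExecutor

def make_system(name):
    def run(inbox):
        # a system reads whatever messages it currently has and emits a result
        return f"{name} processed {len(inbox)} message(s)"
    return {"name": name, "inbox": [], "run": run}

systems = [make_system(f"sys{i}") for i in range(4)]

with ThreadPoolExecutor() as pool:
    for tick in range(3):
        chosen = random.sample(systems, k=2)       # random selection of systems
        futures = {s["name"]: pool.submit(s["run"], s["inbox"]) for s in chosen}
        for name, fut in futures.items():
            msg = fut.result()
            target = random.choice(systems)        # relay the result onward
            target["inbox"].append(msg)
            print(f"tick {tick}: {msg} -> {target['name']}")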

Because the systems are independent of one another, there is no crash if one of them is unable to carry out its instructions. Better than that, other systems can be introduced whose purpose is to detect problems with other systems and rerun them if necessary, or to change them slightly, if need be, to allow them to complete their assigned tasks. In the computer world, that's known as self-repairing code, and it's something many people would like to see in computers running in the real world. With this new machine, the researchers have demonstrated that such a computer can be built.
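A hedged illustration of the repair idea in Python (the names and the reset step are invented for the example, not taken from the paper): one system watches a flaky worker and, when the worker fails, restores its data to a known-good state and reruns it.

import random

def flaky_worker(data):
    if random.random() < 0.3:          # simulate an occasional fault
        raise RuntimeError("worker fault")
    data["count"] += 1

def repair_system(worker, data):
    # keep retrying: on failure, restore a known-good state and rerun
    while True:
        try:
            worker(data)
            return
        except RuntimeError:
            data["count"] = 0

state = {"count": 0}
for _ in range(20):
    repair_system(flaky_worker, state)
print(state)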


More information: www0.cs.ucl.ac.uk/staff/ucacpjb/SABEC2.pdf


User comments: 14


flashgordon
1.2 / 5 (5) Feb 15, 2013
Sounds Minsky like; surprised this hadn't been done sooner.
Eikka
1.9 / 5 (8) Feb 15, 2013
Problems come in when the counter becomes mixed up, or code that has been executed fails to return control so that the next line can be run.


Or the program runs into a state where it cannot continue, even though the computer itself isn't stuck. It's useless to continue computing it, so it signals a fault. If that program is the user interface, or the graphics driver etc. it's as good as if the whole computer had crashed.

I fail to see the difference from pre-emptive multitasking anyhow. There's a scheduler program that is called periodically by a hardware interrupt, which does essentially what their FPGA does.

The reason why you can see the Blue Screen of Death in the first place is because the scheduler program takes over from the crashed program and transfers control to another program that shows you the message. It does so because the program that failed did something essential, and the system can't continue to function normally without it.
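A minimal software analogy of the scheduling mechanism described above (a sketch of my own, not from the paper or the thread; the task names are made up): a timer tick repeatedly hands control back to the scheduler, so a task that crashes is dropped while the others keep running.

import random

def task(name):
    def step():
        if name == "flaky" and random.random() < 0.2:
            raise RuntimeError(f"{name} crashed")
        print(f"{name} did a slice of work")
    return step

run_queue = {"editor": task("editor"), "player": task("player"),
             "flaky": task("flaky")}

for tick in range(10):                 # each iteration stands in for a timer interrupt
    name, step = random.choice(list(run_queue.items()))   # scheduler picks the next task
    try:
        step()
    except RuntimeError as err:
        print(f"scheduler: {err}; dropping it and carrying on")
        del run_queue[name]            # the remaining tasks keep running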
Sanescience
1.8 / 5 (5) Feb 15, 2013
This should be more specific by saying it addresses a class of problems that cause a computer to crash. I don't think this changes Turing principles or P vs NP. And it doesn't help many forms of computation, like deciding whether a point on the complex plane belongs to the Mandelbrot set.
Tangent2
1.6 / 5 (5) Feb 15, 2013
The reason why you can see the Blue Screen of Death in the first place is because the scheduler program takes over from the crashed program and transfers control to another program that shows you the message. It does so because the program that failed did something essential, and the system can't continue to function normally without it.


This is true, but this is also done in a sequential manner, and that was the main point of the article. In a sequential setup, if the system gets stuck on one process, the others cannot proceed until that process is aborted. In the random process setup, even if one process is stuck, the other processes can still be executed on the subsystems of the random process.
grondilu
5 / 5 (2) Feb 15, 2013
Interesting. But a huge paradigm shift. I have difficulties picturing how to program on such a computer.

For some reason I suspect it might be relevant to the GNU Hurd project.
Eikka
1.6 / 5 (8) Feb 15, 2013
This is true, but this is also done in a sequential manner, and that was the main point of the article.


And my main point is that there's no relevant difference between this and sequential execution with a scheduler program. The hardware itself switches over to the scheduler program periodically, so you can't get permanently stuck in a process. You need hardware support for pre-emptive multitasking.

The scheduler program is a short, simple, very optimized piece of code. The failure of the scheduler is phenomenally rare, and pretty much the only reason why it would fail is because of hardware fault and the resulting data corruption.

In the case that the hardware is faulty or unreliable, the error tolerance of the non-sequential scheduler is inconsequential because you can't trust the machine to reliably compute code anyhow. By the time you (the program) realize there's an error, the glitch has already caused data damage and you have to restart.

A broken computer is a broken computer
baudrunner
1.6 / 5 (7) Feb 15, 2013
Reads to me like a baiting technique that creates greater potential for system crashes so that pre-coded error-correcting algorithms can alter the model's projection such that instruction execution continuity can be maintained via divergence. That would be the ultimate application of this principle, and not, in my opinion, one that I would necessarily want to buy into. It increases the potential for data processing software to produce unanticipated, read 'wrong', results.
Foolish1
4 / 5 (1) Feb 15, 2013
We can outsmart a $2 calculator, but we can't out-add or out-subtract one. These free-for-all schemes are great at simulating brains and incredibly wasteful when it comes to achieving deterministic results.

The reality is that fault tolerance is usually a software problem more than a hardware one.
Tausch
1.7 / 5 (6) Feb 15, 2013
lol
".....we're just collecting some error info, and then we're restart you...." - author unknown (Bill Gates?)


See article's illustration - faded BSOD.

I can't live without my minidumps.

I know what you are all thinking:
What a lurid statement!
Well, poo poo on you too.

alfie_null
4 / 5 (1) Feb 16, 2013
As discussed in the paper, the novel aspect here is the use of FPGAs to implement Systemic Computation. Other implementations exist and are cited in the references. The parts they use are still a little high-end for kitchen-table experimentation, but in a few years, who knows?
Anda
5 / 5 (1) Feb 16, 2013
If the blue screen is caused by hardware failure you'll have to replace it.
If the cause is software it means the OS is a s...
That's all.
wealthychef
not rated yet Feb 16, 2013
I don't understand this. The reason that a program proceeds in a certain order is that the bits later on depend on the bits earlier on. If you can execute statements in random order, then they cannot have data dependencies on each other. In which case, there cannot be much power to the computation unless it is embarrassingly parallel somehow.
VendicarE
3.3 / 5 (3) Feb 17, 2013
The shuttle's computers were close to being self-repairing.

Triple redundancy with cross checking of computations. If one section of the computer failed, the redundant element would be used.

In order for there to have been self repair, more computational blocks would have been required to be present along with a method of starting these blocks as required should other blocks go down.

Current technology is such that it is now possible to create CPUs from FPGAs that can literally route around damage to the internal computational elements, although it would require an alternate active core to do it.
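A tiny sketch of the cross-checking idea (illustrative only; the function names are invented): run the same computation on three redundant units and take the majority answer, so a single faulty unit is outvoted.

from collections import Counter

def redundant_compute(units, x):
    results = [unit(x) for unit in units]           # same input to every unit
    winner, votes = Counter(results).most_common(1)[0]
    return winner                                   # majority result wins

good = lambda x: x * x
faulty = lambda x: x * x + 1                        # simulated failed unit
print(redundant_compute([good, good, faulty], 6))   # -> 36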

cyberCMDR
4 / 5 (1) Feb 17, 2013
Sounds like functional programming (e.g. Haskell), but perhaps now at the hardware level.
