Particles from outer space are wreaking low-grade havoc on personal electronics

February 17, 2017
Estimated failure rates from single event upsets at the transistor, integrated circuit and device level for the last three semiconductor architectures. Credit: Bharat Bhuva, Vanderbilt University

You may not realize it, but alien subatomic particles raining down from outer space are wreaking low-grade havoc on your smartphones, computers and other personal electronic devices.

When your computer crashes and you get the dreaded blue screen or your smartphone freezes and you have to go through the time-consuming process of a reset, most likely you blame the manufacturer: Microsoft or Apple or Samsung. In many instances, however, these operational failures may be caused by the impact of electrically charged particles generated by cosmic rays that originate outside the solar system.

"This is a really big problem, but it is mostly invisible to the public," said Bharat Bhuva, professor of electrical engineering at Vanderbilt University, in a presentation on Friday, Feb. 17 at a session titled "Cloudy with a Chance of Solar Flares: Quantifying the Risk of Space Weather" at the annual meeting of the American Association for the Advancement of Science in Boston.

When cosmic rays traveling at fractions of the speed of light strike the Earth's atmosphere they create cascades of secondary particles including energetic neutrons, muons, pions and alpha particles. Millions of these particles strike your body each second. Despite their numbers, this subatomic torrent is imperceptible and has no known harmful effects on living organisms. However, a fraction of these particles carry enough energy to interfere with the operation of microelectronic circuitry. When they interact with integrated circuits, they may alter individual bits of data stored in memory. This is called a single-event upset or SEU.

Since it is difficult to know when and where these particles will strike and they do not do any physical damage, the malfunctions they cause are very difficult to characterize. As a result, determining the prevalence of SEUs is not easy or straightforward. "When you have a single bit flip, it could have any number of causes. It could be a software bug or a hardware flaw, for example. The only way you can determine that it is a single-event upset is by eliminating all the other possible causes," Bhuva explained.

There have been a number of incidents that illustrate how serious the problem can be, Bhuva reported. For example, in 2003 in the town of Schaerbeek, Belgium, a bit flip in an electronic voting machine added 4,096 extra votes to one candidate. The error was only detected because it gave the candidate more votes than were possible, and it was traced to a single bit flip in the machine's register. In 2008, the avionics system of a Qantas passenger jet flying from Singapore to Perth appeared to suffer from a single-event upset that caused the autopilot to disengage. As a result, the aircraft dove 690 feet in only 23 seconds, injuring about a third of the passengers seriously enough to cause the aircraft to divert to the nearest airstrip. In addition, there have been a number of unexplained glitches in airline computers, some of which experts feel must have been caused by SEUs, that have resulted in the cancellation of hundreds of flights and significant economic losses.
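The 4,096-vote figure is what a single flipped bit would be expected to produce: 4,096 is 2 to the 12th power, so inverting bit 12 of a binary counter changes its value by exactly that amount. The Python sketch below illustrates the arithmetic only. The function name and the starting tally of 514 are hypothetical, and nothing here reflects how the actual voting machine stored its counts.

```python
def flip_bit(value: int, bit: int) -> int:
    """Return value with one bit inverted (XOR with a single-bit mask)."""
    return value ^ (1 << bit)


true_count = 514                       # hypothetical tally, chosen for illustration
corrupted = flip_bit(true_count, 12)   # bit 12 has weight 2**12 = 4096

print(corrupted)                # 4610
print(corrupted - true_count)   # 4096
```

Because the bit was 0 before the upset, the flip adds 4,096. Had it been 1, the same event would have subtracted 4,096 instead.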

An analysis of SEU failure rates for consumer electronic devices performed by Ritesh Mastipuram and Edwin Wee at Cypress Semiconductor on a previous generation of technology shows how prevalent the problem may be. Their results were published in 2004 in Electronic Design News and provided the following estimates:

  • A simple cell phone with 500 kilobytes of memory should only have one potential error every 28 years.
  • A router farm like those used by Internet providers with only 25 gigabytes of memory may experience one potential networking error that interrupts their operation every 17 hours.
  • A person flying in an airplane at 35,000 feet (where radiation levels are considerably higher than they are at sea level) who is working on a laptop with 500 kilobytes of memory may experience one potential error every five hours.

Bhuva is a member of Vanderbilt's Radiation Effects Research Group, which was established in 1987 and is the largest academic program in the United States that studies the effects of radiation on electronic systems. The group's primary focus was on military and space applications. Since 2001, the group has also been analyzing radiation effects on consumer electronics in the terrestrial environment. They have studied this phenomenon in the last eight generations of computer chip technology, including the current generation that uses 3D transistors (known as FinFET) that are only 16 nanometers in size.

The 16-nanometer study was funded by a group of top microelectronics companies, including Altera, ARM, AMD, Broadcom, Cisco Systems, Marvell, MediaTek, Renesas, Qualcomm, Synopsys, and TSMC.

"The semiconductor manufacturers are very concerned about this problem because it is getting more serious as the size of the transistors in computer chips shrink and the power and capacity of our digital systems increase," Bhuva said. "In addition, microelectronic circuits are everywhere and our society is becoming increasingly dependent on them."

To determine the rate of SEUs in 16-nanometer chips, the Vanderbilt researchers took samples of the integrated circuits to the Irradiation of Chips and Electronics (ICE) House at Los Alamos National Laboratory. There they exposed them to a neutron beam and analyzed how many SEUs the chips experienced. Experts measure the failure rate of microelectronic circuits in a unit called a FIT, which stands for failure in time. One FIT is one failure per transistor in one billion hours of operation. That may seem infinitesimal, but it adds up extremely quickly with billions of transistors in many of our devices and billions of electronic systems in use today (the number of smartphones alone is in the billions). Most electronic components have failure rates measured in hundreds and thousands of FITs.
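As a rough illustration of how FIT figures add up, the Python sketch below converts a FIT rating into expected failures across many units. It takes one FIT to mean one failure per billion hours for whatever unit the rating is quoted against. The function names, the 1,000-FIT rating and the one-million-unit fleet are hypothetical values chosen for easy arithmetic, not results from the Vanderbilt study.

```python
HOURS_PER_FIT = 1e9  # one FIT = one failure per billion hours of operation


def expected_failures(fit_per_unit: float, units: int, hours: float) -> float:
    """Expected number of failures across all units over the given hours."""
    return fit_per_unit * units * hours / HOURS_PER_FIT


def mean_hours_between_failures(fit_per_unit: float, units: int) -> float:
    """Average time between failures anywhere in the whole population."""
    return HOURS_PER_FIT / (fit_per_unit * units)


# Hypothetical fleet: one million components, each rated at 1,000 FIT.
print(mean_hours_between_failures(1000, 1_000_000))  # 1.0
print(expected_failures(1000, 1_000_000, 24))        # 24.0
```

Under these assumed numbers, a part that fails individually about once per million hours still produces roughly one failure per hour somewhere in the fleet.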

"Our study confirms that this is a serious and growing problem," said Bhuva. "This did not come as a surprise. Through our research on radiation effects on electronic circuits developed for military and space applications, we have been anticipating such effects on electronic systems operating in the terrestrial environment."

Although the details of the Vanderbilt studies are proprietary, Bhuva described the general trend that they have found in the last three generations of integrated circuit technology: 28-nanometer, 20-nanometer and 16-nanometer.

As transistor sizes have shrunk, they have required less and less electrical charge to represent a logical bit. So the likelihood that one bit will "flip" from 0 to 1 (or 1 to 0) when struck by an energetic particle has been increasing. This has been partially offset by the fact that as the transistors have gotten smaller they have become smaller targets so the rate at which they are struck has decreased.

More significantly, the current generation of 16-nanometer circuits has a 3D architecture that replaced the previous 2D architecture and has proven to be significantly less susceptible to SEUs. Although this improvement has been offset by the increase in the number of transistors in each chip, the failure rate at the chip level has also dropped slightly. However, the increase in the total number of transistors being used in new electronic systems has meant that the SEU failure rate at the device level has continued to rise.

Unfortunately, it is not practical to simply shield microelectronics from these energetic particles. For example, it would take more than 10 feet of concrete to keep a circuit from being zapped by energetic neutrons. However, there are ways to design computer chips to dramatically reduce their vulnerability.

For cases where reliability is absolutely critical, you can simply design the processors in triplicate and have them vote. Bhuva pointed out: "The probability that SEUs will occur in two of the circuits at the same time is vanishingly small. So if two circuits produce the same result it should be correct." This is the approach that NASA used to maximize the reliability of spacecraft computer systems.
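The voting idea can be sketched in a few lines of Python. This is a minimal software illustration of majority voting over three redundant results, not a description of NASA's or any manufacturer's hardware. The function name and the sample values are hypothetical.

```python
from collections import Counter


def majority_vote(a, b, c):
    """Return the value that at least two of the three redundant results agree on."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        # All three disagree: the voter cannot tell which result is correct.
        raise ValueError("no two results agree")
    return value


correct = 42
upset = correct ^ (1 << 3)   # simulate a single bit flip in one copy (42 -> 34)

print(majority_vote(correct, upset, correct))  # 42
```

The voter masks a fault in any single copy. It fails only if two copies are corrupted in the same way, or if all three disagree, which is the case Bhuva describes as vanishingly unlikely.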

The good news, Bhuva said, is that the aviation, medical equipment, IT, transportation, communications, financial and power industries are all aware of the problem and are taking steps to address it. "It is only the consumer electronics sector that has been lagging behind in addressing this problem."

The engineer's bottom line: "This is a major problem for industry and engineers, but it isn't something that members of the general public need to worry much about."

13 comments

barakn
5 / 5 (4) Feb 17, 2017
For example, it would take more than 10 feet of concrete to keep a circuit from being zapped by energetic neutrons.

If I was trying to shield against energetic neutrons, concrete is one of the last things I would think of. Instead I'd consider something with a high density of hydrogen like wood, plastic, or even water. The almost equal mass of the proton and neutron allows for a large transfer of momentum from the neutron to the proton during scattering events, allowing the neutrons to be quickly slowed to thermal speeds and greatly increasing the chance they'll be absorbed the next time they encounter an atomic nucleus.
LRW
not rated yet Feb 17, 2017
Don't forget Nuclear Missile control systems---
Tincho147
not rated yet Feb 17, 2017
About a month ago I bought an Asus ZenPad (z500m) and got on a plane from Italy to Spain. About an hour into the flight I wanted to read a note I had on the tablet but when I pressed the power button my screen was displaying an obvious software breakdown. The tablet was less than two days old. Now I'm questioning my original version of having a mid-range tablet since it has worked perfectly since and I use it heavily every day.
julianpenrod
1 / 5 (6) Feb 17, 2017
One can wonder where the recent revelation, also in Phys Org, of "clouds" where radiation from outer space is particularly heavy, fits with this. The radiation permeable "clouds" seem a recent development. And one would think that this problem, to one degree or another, would have been noticed long before this, even with larger system elements. One would also think, if this were such a problem, that sensitive systems would have been designed to avoid it. If software failures can cause the same effects, one can wonder why the drop of the plane and "diversion" and the shutting down of air traffic didn't take place before. One can wonder where the requirement that electronics not be used on flights relates to this. In fact, the "diversion" and airport shutdown sound much like dodges airlines use to avoid flying into air denatured by chemtrails so it won't support aircraft, anymore.
Nik_2213
5 / 5 (1) Feb 17, 2017
Uh, didn't servers used to have parity-checking RAM ??
manfredparticleboard
not rated yet Feb 17, 2017
I think my assessment of very very very unlikely is supported, especially on the ground, when I posted about this in the ionizing radiation cloud thread. Multiple redundancies in avionics systems are pretty good at screening out any danger. But no system is perfect and there can always be a weakspot or vulnerability to this kind of interference, it's just a case of trying to hit a very small target.
Denatured air? What frazzle haired quackery is this? Can you put any SI units to this effect you refer to. Milli mol of delusion per cubic meter perhaps?
Pooua
5 / 5 (6) Feb 17, 2017
@julianpenrod As someone who has been following information about the radiation environment in our atmosphere and space, I suggest that you broaden your sources of information on the subject from just a casual perusal of phys.org articles. You are making erroneous extrapolations. Namely, the effects of space radiation have been known for a long time. At times, a regional power grid has tripped offline due to space weather.

Chemtrails is a hoax.
IronhorseA
not rated yet Feb 17, 2017
Uh, didn't servers used to have parity-checking RAM ??


Yes, but it adds latency to memory access, which is something that gamers try to avoid.
StudentofSpiritualTeaching
Feb 17, 2017
This comment has been removed by a moderator.
rrrander
not rated yet Feb 18, 2017
They mentioned alpha particles. Even energetic ones can't travel more than an inch or two in air, so how would they make it through the atmosphere? I've also seen cellphones bombarded with gamma radiation that would have killed a human. They still worked well, but I suppose some functions might have been compromised had anyone been using them at the time.
Gigel
5 / 5 (1) Feb 18, 2017
It's muons that are produced by cosmic radiation and penetrate deep into the atmosphere and even into the ground. But muons cannot be stopped easily; it would take hundreds of meters of ground to reduce them significantly. Muons can produce other particles though by disintegration and maybe by collisions.
Eikka
not rated yet Feb 18, 2017
Where can I buy 0.5 MB RAM notebooks?


That's a throwback to times when a 486DX2 laptop running DOS would have 640 kB of main memory, and 4-8 MB of extended memory. Probably somewhere around 1996.

These machines can be still found online for about $20.
randomcyborg
not rated yet Feb 20, 2017
Everyone in the field has known about this stuff for decades.

Error correction codes (instead of much simpler error detection codes) are often used when it's effectively impossible for the receiver to ask the sender to retransmit the last message, such as sending instructions to a Mars rover, or real-time situations.

An extremely common situation is where it's actually impossible (as opposed to effectively impossible) to retransmit data simply because the data doesn't exist any more -- during the execution of a program. For example, a variable's value has been overwritten, or the variable, itself, no longer exists because it was local to a subroutine that has returned. The subroutine could be called again, but that variable will most likely have a different value during the subroutine's current call than during the subroutine's previous call.

Virtually all error detection and error correction is done in hardware -- so plan ahead.
