Researchers combine logic, memory to build a 'high-rise' chip

December 15, 2014
This illustration represents the four-layer prototype high-rise chip built by Stanford engineers. The bottom and top layers are logic transistors. Sandwiched between them are two layers of memory. The vertical tubes are nanoscale electronic "elevators" that connect logic and memory, allowing them to work together to solve problems. Credit: Max Shulaker, Stanford

For decades, the mantra of electronics has been smaller, faster, cheaper. Today, Stanford engineers add a fourth word - taller.

At a conference in San Francisco, a Stanford team will reveal how to build high-rise chips that could leapfrog the performance of the single-story logic and memory chips on today's circuit cards.

Those circuit cards are like busy cities in which logic chips compute and memory chips store data. But when the computer gets busy, the wires connecting logic and memory can get jammed.

The Stanford approach would end these jams by building layers of logic atop layers of memory to create a tightly interconnected high-rise chip. Many thousands of nanoscale electronic "elevators" would move data between the layers much faster, using less electricity, than the bottleneck-prone wires connecting single-story logic and memory chips today.

The work is led by Subhasish Mitra, a Stanford professor of electrical engineering and computer science, and H.-S. Philip Wong, the Williard R. and Inez Kerr Bell Professor in Stanford's School of Engineering. They describe their new high-rise chip architecture in a paper being presented at the IEEE International Electron Devices Meeting on Dec. 15-17.

The researchers' innovation leverages three breakthroughs.

The first is a new technology for creating transistors, those tiny gates that switch electricity on and off to create digital zeroes and ones. The second is a new type of computer memory that lends itself to multi-story fabrication. The third is a technique to build these new logic and memory technologies into high-rise structures in a radically different way than previous efforts to stack chips.

"This research is at an early stage, but our design and fabrication techniques are scalable," Mitra said. "With further development this architecture could lead to computing performance that is much, much greater than anything available today."

Wong said the prototype chip unveiled at IEDM shows how to put logic and memory together into three-dimensional structures that can be mass-produced.

"Paradigm shift is an overused concept, but here it is appropriate," Wong said. "With this new architecture, electronics manufacturers could put the power of a supercomputer in your hand."

Silicon heat

Engineers have been making silicon chips for decades, but the heat emanating from phones and laptops is evidence of a problem. Even when they are switched off, some electricity leaks out of silicon transistors. Users feel that as heat. But at a system level, the leakage drains batteries and wastes electricity.

Researchers have been trying to solve this major problem by creating carbon nanotube - or CNT - transistors. They are so slender that nearly 2 billion CNTs could fit within a human hair. CNTs should leak less electricity than silicon because their tiny diameters are easier to pinch shut.

The image on the left depicts today's single-story electronic circuit cards, where logic and memory chips exist as separate structures, connected by wires. Like city streets, those wires can get jammed with digital traffic going back and forth between logic and memory. On the right, Stanford engineers envision building layers of logic and memory to create skyscraper chips. Data would move up and down on nanoscale "elevators" to avoid traffic jams. Credit: Wong/Mitra Lab, Stanford

Mitra and Wong are presenting a second paper at the conference showing how their team made some of the highest performance CNT transistors ever built.

They did this by solving a big hurdle: packing enough CNTs into a small enough area to make a useful chip.

Until now the standard process used to grow CNTs did not create a sufficient density of these tubes. The Stanford engineers solved this problem by developing an ingenious technique.

They started by growing CNTs the standard way, on round quartz wafers. Then they added their trick. They created what amounts to a metal film that acts like a tape. Using this adhesive process they lifted an entire crop of CNTs off the quartz growth medium and placed it onto a silicon wafer.

This silicon wafer became the foundation of their high-rise chip.

But first they had to fabricate a CNT layer with sufficient density to make a high-performance logic device. So they went through this process 13 times, growing a crop of CNTs on the quartz wafer and then using their transfer technique to lift and deposit these CNTs onto the silicon wafer.
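To picture how the repeated transfers build up density, here is a minimal sketch in Python. Only the count of 13 grow-and-transfer passes comes from the article; the number of CNTs deposited per pass is an assumed placeholder, not a figure from the Stanford paper.

```python
# Illustrative only: cumulative CNT density after repeated grow-and-transfer passes.
# The per-pass density below is a placeholder, not a value from the Stanford paper.

CNTS_PER_PASS_PER_UM = 4   # assumed CNTs deposited per micron of wafer, per pass
TRANSFER_PASSES = 13       # number of grow-and-transfer passes reported in the article

def cumulative_density(passes: int, per_pass: float) -> float:
    """Each pass grows CNTs on quartz, lifts them with the metal-film 'tape',
    and deposits them onto the same silicon wafer, so the densities add up."""
    return passes * per_pass

for n in range(1, TRANSFER_PASSES + 1):
    print(f"after pass {n:2d}: ~{cumulative_density(n, CNTS_PER_PASS_PER_UM)} CNTs/um")
```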

Using this elegant technological fix, they achieved some of the highest density, highest performance CNTs ever made - especially given that they did this in an academic lab with less sophisticated equipment than a commercial fabrication plant.

Moreover, the Stanford team showed that they could perform this technique on more than one layer of logic as they created their high-rise chip.

What about the memory?

Creating high-performance layers of CNT transistors was only part of their innovation. Just as important was their ability to build a new type of memory directly atop each layer of CNTs.

Wong is a world leader in this new memory technology, which he unveiled at last year's IEDM conference.

Unlike today's memory chips, this new storage technology is not based on silicon.

Instead, the Stanford team fabricated memory using titanium nitride, hafnium oxide and platinum, forming a metal/oxide/metal sandwich. Applying electricity to this sandwich in one direction causes it to resist the flow of electricity. Reversing the electric jolt causes the structure to conduct electricity again.

The change from resistive to conductive states is how this new memory technology creates digital zeroes and ones. The change in conductive states also explains its name: resistive random access memory, or RRAM.
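As a rough illustration of how such a resistive cell stores a bit, here is a toy model in Python; the resistance values and pulse voltages are assumptions chosen for illustration, not device parameters reported by the Stanford team.

```python
# Toy model of a bipolar RRAM cell: a pulse of one polarity "sets" it into a
# low-resistance (conductive) state, the opposite polarity "resets" it into a
# high-resistance state. All values are illustrative, not measured.

LOW_RES_OHMS = 1e3    # assumed low-resistance state, read as a logical 1
HIGH_RES_OHMS = 1e6   # assumed high-resistance state, read as a logical 0

class RRAMCell:
    def __init__(self):
        self.resistance = HIGH_RES_OHMS   # start in the resistive state (0)

    def apply_pulse(self, volts: float):
        """The polarity of the programming pulse selects the state."""
        if volts > 0:
            self.resistance = LOW_RES_OHMS    # set: conductive -> 1
        elif volts < 0:
            self.resistance = HIGH_RES_OHMS   # reset: resistive -> 0

    def read(self) -> int:
        """Read the stored bit by checking which resistance state the cell is in."""
        return 1 if self.resistance == LOW_RES_OHMS else 0

cell = RRAMCell()
cell.apply_pulse(+2.0)   # write a 1
assert cell.read() == 1
cell.apply_pulse(-2.0)   # write a 0
assert cell.read() == 0
```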

Wong designed RRAM to use less energy than current memory, leading to prolonged battery life in mobile devices.

Inventing this new memory technology was also the key to creating the high-rise chip because RRAM can be made at much lower temperatures than silicon memory.

Interconnected layers

Max Shulaker and Tony Wu, Stanford graduate students in electrical engineering, created the techniques behind the four-story high-rise chip unveiled at the conference.

Everything hinged on the low-heat process for making RRAM and CNTs, which enabled them to fabricate each layer of memory directly atop each layer of CNT logic. While making each memory layer, they were able to drill thousands of interconnections into the logic layer below.

This multiplicity of connections is what enables the high-rise chip to avoid the traffic jams on conventional circuit cards.
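A back-of-the-envelope sketch suggests why many slow vertical links can still beat one fast bus. Every number below is an assumption chosen for illustration, not a measurement from the prototype.

```python
# Back-of-the-envelope comparison (all numbers are illustrative assumptions):
# a single off-chip memory bus versus thousands of on-chip vertical interconnects.

BUS_WIDTH_BITS = 64            # assumed width of a conventional memory bus
BUS_RATE_GBPS_PER_LINE = 1.6   # assumed per-line signaling rate

VIA_COUNT = 10_000             # "thousands of nanoscale elevators" per layer pair
VIA_RATE_GBPS = 0.1            # assumed (much slower) rate per tiny vertical link

bus_bandwidth = BUS_WIDTH_BITS * BUS_RATE_GBPS_PER_LINE   # Gbit/s
via_bandwidth = VIA_COUNT * VIA_RATE_GBPS                 # Gbit/s

print(f"off-chip bus : ~{bus_bandwidth:,.0f} Gbit/s")
print(f"stacked vias : ~{via_bandwidth:,.0f} Gbit/s "
      f"({via_bandwidth / bus_bandwidth:.0f}x, even with slow individual links)")
```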

There is no way to tightly interconnect layers using today's conventional silicon-based logic and memory. That's because it takes so much heat to build a layer of silicon memory - about 1,000 degrees centigrade - that any attempt to do so would melt the logic below.

Previous efforts to stack chips could save space but not avoid the digital traffic jams. That's because each layer would have to be built separately and then connected by wires, which would still be prone to traffic jams, unlike the nanoscale elevators in the Stanford design.




Comments

1.2 / 5 (5) Dec 15, 2014
Parallel processing with extended storage, nothing new.
5 / 5 (2) Dec 15, 2014
Parallel processing with extended storage, nothing new.

Though more compact. One can fit four times as much on a single wafer with four layers, dropping the cost per chip by a factor of four, or packing four times as much onto each chip.

The only problems would be heat management and the endurance and speed of the new RRAM cells, because CPUs normally use SRAM cells, which are both fast and have a virtually infinite lifetime compared to memories that rely on altering the material properties of a medium.

The downside of SRAM of course is its higher power demand, so the RRAM may actually be necessary for the stacked chip to function at all, since the "elevators", or inter-layer vias, will be largely responsible for carrying heat out from the middle layers. The electrical insulators between layers are also typically fairly good heat insulators, because heat flows easily wherever electrons move freely.

The heat management has been a major stumbling block to 3D chips.
not rated yet Dec 15, 2014
It looks like state-of-the-art RRAM cells operate in the 100 MHz speed range for low-current operation, and the 3 GHz range for high-current operation. They're still slightly slow for mobile processors, except for simple embedded controllers where speed is not required.
not rated yet Dec 15, 2014
RRAM CAM hanging off the FPGA bus, but how do you I/O the graphs to secondary storage?
The third wave: von Neumann, GPGPU, and FPGA/CAM.
not rated yet Dec 15, 2014
"With this new architecture, electronics manufacturers could put the power of a supercomputer in your hand."
Wrong. Here is why.

The world's first supercomputer, Control Data Corporation's CDC 6600, was capable of about 1 megaflops. Today's high-end PCs can perform about 7 gigaflops. Today, the fastest supercomputer in the world, China's Tianhe-2, can perform almost 34 petaflops.

All things being relative, by the time we see a new CPU using this towering architecture on our desks, the supercomputer will be faster still. I certainly don't think a PC will ever reach the petaflop speed, because Tianhe-2 uses 16,000 computer nodes, each comprising two Intel Ivy Bridge Xeon processors and three Xeon Phi chips, for a total of 3,120,000 cores. You won't see that in your desktop anytime soon.
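Putting those quoted figures side by side, here is a quick Python check using only the numbers in this comment (none re-verified):

```python
# Quick check of the ratios quoted above (figures from the comment, not re-verified).
pc_flops = 7e9            # ~7 gigaflops for a high-end PC
tianhe2_flops = 34e15     # ~34 petaflops for Tianhe-2
cores = 3_120_000         # Tianhe-2 core count quoted above

print(f"Tianhe-2 vs PC : ~{tianhe2_flops / pc_flops:,.0f}x")
print(f"per core       : ~{tianhe2_flops / cores / 1e9:.1f} gigaflops")
```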
5 / 5 (1) Dec 15, 2014
When your memory is slow, you want to make it more parallel. In the presented design, every logic cluster can have its own local lane to nearby memory, removing the current bottleneck, the narrow memory bus. To be able to shrink everything, all parts must be as low-power as possible. That is where this RRAM helps, because in theory it does not need electricity to hold information.
5 / 5 (1) Dec 16, 2014
because in theory it does not need electricity to hold information.

But it needs current to read it, because the measurement of the data is done by passing a current through the resistance, which of course causes Joule heating. This current then flips the state of the bit, or alters it slightly so it has to be written back.

Apparently the RRAM cells today need on the order of 1 µA of current to read/write so if you're constantly accessing data, that can stack up to quite a lot.

A CPU even when idle is executing some program, even if it's just waiting for an interrupt to happen, because it has all sorts of clocks and counters and schedulers running that facilitate multitasking, so it's constantly reading and writing to the local cache and registers.
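As a rough sense of how that stacks up, here is a small Python sketch; only the ~1 µA figure comes from above, while the read voltage and the number of simultaneously active cells are assumptions.

```python
# Rough arithmetic on the comment's figure (~1 uA per RRAM cell during an access).
# The read voltage and the numbers of simultaneously active cells are assumptions.

READ_CURRENT_A = 1e-6     # ~1 uA per cell while it is being read or written
READ_VOLTAGE_V = 0.5      # assumed read voltage

power_per_cell_w = READ_CURRENT_A * READ_VOLTAGE_V   # ~0.5 uW per active cell

for active_cells in (1_000, 100_000, 1_000_000):
    total_mw = power_per_cell_w * active_cells * 1e3
    print(f"{active_cells:>9,} cells active at once -> ~{total_mw:.1f} mW")
```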

not rated yet Dec 21, 2014
RRAM this close to the logic dies would definitely increase the performance of these systems-on-stacks. By sandwiching RRAM between two logic layers, there should also be a considerable reduction in hot-spot effects. Unwanted parasitic effects could be reduced as well.

If chips like these are mass-produced, AI enthusiasts could use them for neuromorphic computing. I wonder what the applications of these chips created at the Mitra/Wong lab at Stanford University are. If there were some ROI, the big semiconductor companies would have fabs for this by now, wouldn't they? So what are the obstacles to commercializing this technology?

BTW, I did a pet project in 2005 to come up with adiabatic logic gates based on 2×1 multiplexers for similar CN+CMOS chips. At that time, there were no effective EDA tools to construct these chips. I wonder what EDA software tools they used to build this chip at the Stanford lab.
