Three-dimensional chip combines computing and data storage

The 3D nanosystem. Credit: Nature (2017). DOI: 10.1038/nature22994

As embedded intelligence is finding its way into ever more areas of our lives, fields ranging from autonomous driving to personalized medicine are generating huge amounts of data. But just as the flood of data is reaching massive proportions, the ability of computer chips to process it into useful information is stalling.

Now, researchers at Stanford University and MIT have built a new chip to overcome this hurdle. The results are published today in the journal Nature; the lead author is Max Shulaker, an assistant professor of electrical engineering and computer science at MIT. Shulaker began the work as a PhD student alongside H.-S. Philip Wong and his advisor Subhasish Mitra, professors of electrical engineering and computer science at Stanford. The team also included professors Roger Howe and Krishna Saraswat, both from Stanford.

Computers today comprise different chips cobbled together. There is a chip for computing and a separate chip for data storage, and the connections between the two are limited. As applications analyze increasingly massive volumes of data, the limited rate at which data can be moved between different chips is creating a critical communication "bottleneck." And with limited real estate on the chip, there is not enough room to place them side-by-side, even as they have been miniaturized (a phenomenon known as Moore's Law).

To make matters worse, the underlying devices, transistors made from silicon, are no longer improving at the historic rate that they have for decades.

The new prototype chip is a radical change from today's chips. It uses multiple nanotechnologies, together with a new computer architecture, to reverse both of these trends.

Instead of relying on silicon-based devices, the chip uses carbon nanotubes, which are sheets of 2-D graphene formed into nanocylinders, and resistive random-access memory (RRAM) cells, a type of nonvolatile memory that operates by changing the resistance of a solid dielectric material. The researchers integrated over 1 million RRAM cells and 2 million field-effect transistors, making the most complex nanoelectronic system ever made with emerging nanotechnologies.

The RRAM and carbon nanotubes are built vertically over one another, making a new, dense 3-D computer architecture with interleaving layers of logic and memory. By inserting ultradense wires between these layers, this 3-D architecture promises to address the communication bottleneck.

However, such an architecture is not possible with existing silicon-based technology, according to the paper's lead author, Max Shulaker, who is a core member of MIT's Microsystems Technology Laboratories. "Circuits today are 2-D, since building conventional silicon transistors involves extremely high temperatures of over 1,000 degrees Celsius," says Shulaker. "If you then build a second layer of silicon circuits on top, that high temperature will damage the bottom layer of circuits."

The key in this work is that carbon nanotube circuits and RRAM can be fabricated at much lower temperatures, below 200 degrees Celsius. "This means they can be built up in layers without harming the circuits beneath," Shulaker says.

This provides several simultaneous benefits for future computing systems. "The devices are better: Logic made from carbon nanotubes can be an order of magnitude more energy-efficient compared to today's logic made from silicon, and similarly, RRAM can be denser, faster, and more energy-efficient compared to DRAM," Wong says, referring to a conventional memory known as dynamic random-access memory.

"In addition to improved devices, 3-D integration can address another key consideration in systems: the interconnects within and between chips," Saraswat adds.

"The new 3-D computer architecture provides dense and fine-grained integration of computing and data storage, drastically overcoming the bottleneck from moving data between chips," Mitra says. "As a result, the chip is able to store massive amounts of data and perform on-chip processing to transform a data deluge into useful information."

To demonstrate the potential of the technology, the researchers took advantage of the ability of carbon nanotubes to also act as sensors. On the top layer of the chip they placed over 1 million carbon nanotube-based sensors, which they used to detect and classify ambient gases.

Due to the layering of sensing, data storage, and computing, the chip was able to measure each of the sensors in parallel, and then write directly into its memory, generating huge bandwidth, Shulaker says.
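To make the benefit of parallel readout concrete, here is a minimal sketch comparing the time to read every sensor once, one at a time versus all at once. The sensor count follows the article's "over 1 million"; the per-sensor read time and the function name `readout_time` are hypothetical values chosen only for illustration, not figures from the paper.

```python
def readout_time(num_sensors: int, read_time_s: float, lanes: int) -> float:
    """Time to read every sensor once when `lanes` sensors are read at a time."""
    batches = -(-num_sensors // lanes)  # ceiling division
    return batches * read_time_s


NUM_SENSORS = 1_000_000  # "over 1 million" sensors, per the article
READ_TIME_S = 1e-6       # hypothetical per-sensor read time (assumption)

# One sensor at a time versus every sensor measured at once.
serial = readout_time(NUM_SENSORS, READ_TIME_S, lanes=1)
parallel = readout_time(NUM_SENSORS, READ_TIME_S, lanes=NUM_SENSORS)
print(f"serial: {serial:.6f} s, fully parallel: {parallel:.6f} s")
```

Under these assumptions the total readout time falls in proportion to the number of sensors read at once, which is the sense in which layering sensors directly above memory can raise bandwidth.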

"One big advantage of our demonstration is that it is compatible with today's silicon infrastructure, both in terms of fabrication and design," says Howe.

"The fact that this strategy is both CMOS [complementary metal-oxide-semiconductor] compatible and viable for a variety of applications suggests that it is a significant step in the continued advancement of Moore's Law," says Ken Hansen, president and CEO of the Semiconductor Research Corporation, which supported the research. "To sustain the promise of Moore's Law economics, innovative heterogeneous approaches are required as dimensional scaling is no longer sufficient. This pioneering work embodies that philosophy."

The team is working to improve the underlying nanotechnologies, while exploring the new 3-D computer architecture. For Shulaker, the next step is working with Massachusetts-based semiconductor company Analog Devices to develop new versions of the system that take advantage of its ability to carry out sensing and data processing on the same chip.

So, for example, the devices could be used to detect signs of disease by sensing particular compounds in a patient's breath, says Shulaker.

"The technology could not only improve traditional computing, but it also opens up a whole new range of applications that we can target," he says. "My students are now investigating how we can produce chips that do more than just computing."

"This demonstration of the 3-D integration of sensors, memory, and logic is an exceptionally innovative development that leverages current CMOS technology with the new capabilities of carbon nanotube field-effect transistors," says Sam Fuller, CTO emeritus of Analog Devices, who was not involved in the research. "This has the potential to be the platform for many revolutionary applications in the future."

More information: Max M. Shulaker et al, Three-dimensional integration of nanotechnologies for computing and data storage on a single chip, Nature (2017). DOI: 10.1038/nature22994
Journal information: Nature

Citation: Three-dimensional chip combines computing and data storage (2017, July 5) retrieved 22 August 2019 from

User comments

Jul 05, 2017
This device is very exciting, implementing lots of long-sought tech: 3D on-chip interconnects, VLSI CNT, layers from sensors through logic to storage, megaparallelism. But is it a "computer"?

This article and the paper's abstract indicate it's only some FETs in a very simple logic circuit inline with each sensor and memory cell. It's not a "processor", no instructions, no data integration across these simple stacked units, no communication with anything outside the chip. These are major differences from an actual "computer". There's no programming, no processing variation from the dedicated "store a slightly transformed value taken from the sensor", and no support for anything like programming. It's more like a dumb memory chip, though potentially far higher performance, with a tiny but static logic transform that doesn't communicate among the many values processed in parallel.

It's clearly a good start, but any basis for programmability is the watershed it needs to cross.

Jul 05, 2017
I also wonder about heat dissipation. The main bottleneck to improved CPU performance for over a decade has been getting all the heat generated by the huge and dense processing work off the chip before it fries the chip. I don't see any integrated cooling in this device, and it's extremely dense.

Meanwhile it's touted as using much cooler manufacturing to preserve the delicate components through the different layer building phases - which could imply they're even more vulnerable to heat. Each functional layer insulates the others, much worse than 2D chips. And CNTs are advantageous over Si transistors because they can be oscillated at far higher rates. So even if they're "an order of magnitude more energy-efficient" than Si, that's only a couple-few times higher density than Si before it hits the same thermal wall. Likely sooner due to the insulating (actually heat generating) layers.

Jul 06, 2017
Being an order of magnitude more energy-efficient than silicon should mean an order of magnitude less heat, but carbon also conducts heat better, so cooling should be more effective.

As for programming, video cards do many simple transforms on little bits of data in parallel, and you get fast graphics. That has been applied to some big data problems. Maybe it could be applied to general computing, and the x86 way of doing things could enter the dustbin of history.

Jul 06, 2017
Being an order of magnitude more energy-efficient than silicon should mean an order of magnitude less heat, but carbon also conducts heat better, so cooling should be more effective.

I don't know that the order of magnitude (10x) better performance per heat plus the carbon conduction is sufficient to overcome several magnitudes higher frequency and the 3D stacking of heat generation, as the heating volume to dissipating surface area grows worse as l^3:l^2.

In any case it's a primary issue that's not addressed in this article or the paper. Which makes me suspicious that it's a problem, or else MIT would surely brag on improving it.
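The l^3:l^2 scaling mentioned in the comment above can be checked with a short sketch. It assumes an idealized cube that generates heat throughout its volume and sheds it only through its surface; a real chip is a thin slab, so this is a simplification, and the function name is illustrative.

```python
def volume_to_surface_ratio(side: float) -> float:
    """Ratio of volume to surface area for a cube with the given side length."""
    volume = side ** 3
    surface = 6 * side ** 2
    return volume / surface  # algebraically equal to side / 6


# The ratio grows linearly with the side length.
for side in (1.0, 2.0, 4.0, 8.0):
    ratio = volume_to_surface_ratio(side)
    print(f"side={side:>4}: volume/surface = {ratio:.3f}")
```

Doubling the side length doubles the ratio, so in this idealized model each unit of surface has to carry away twice as much heat.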

Jul 06, 2017
As for programming, video cards do many simple transforms on little bits of data in parallel [...] fast graphics [...] big data [...] general computing [...] x86 [...] dustbin of history.

GPUs are extremely fast because they're highly parallel, but they're highly programmable, in "shader language" like Cg, GLSL, PSSL, etc. "General Purpose GPU" (GPGPU) is harnessed for certain big data problems, and is the basis for crazy fast devices like Google's "Tensor Processing Unit" AI TPU, where each one is as fast as the world's current #7 fastest supercomputer - available cheap and easy to program in the cloud.

Nvidia's Tesla V100 GPU has 21 billion transistors in 5120 stream processors, so over 4.1 million transistors per parallel processor, with several programmable special-purpose shaders chained into a 4.1 million transistor pipeline in each processor. This 3D chip has a single transistor per unit, not even slightly programmable.
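The per-processor figure in the comment above is a simple division. A quick check, using only the two numbers quoted in the comment (the variable names are illustrative):

```python
# Figures as quoted in the comment above (not independently verified here).
total_transistors = 21_000_000_000  # Tesla V100 transistor count
stream_processors = 5120            # parallel stream processors

# Average transistor budget per stream processor.
per_processor = total_transistors / stream_processors
print(f"{per_processor / 1e6:.2f} million transistors per processor")
```

This is an average over the whole chip, and it comes out to roughly 4.1 million, consistent with the comment.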
