Petacache: Use that Memory

Mar 07, 2006
SLAC's Computerraum

For decades, experimental high-energy physicists have struggled with a fundamental problem: they simply have too much data to analyze quickly and in its entirety.

BaBar researchers routinely wait nine months for computers to sift through large datasets, searching for interesting events and setting these aside for later analysis. This “data skimming” alone constantly uses about 50 percent of BaBar's computing power. And that’s before a researcher can even start analyzing her or his data. Preparing data from CERN's Large Hadron Collider (LHC) will only take longer.

Recognizing this widespread limitation, a team at SLAC is developing the PetaCache project, a new way of thinking about data access and storage. With new computer software and more efficient types of memory, PetaCache may significantly increase the speed of data analysis.

"PetaCache may help scientists change the way they think about exploring new ideas," said PetaCache project manager Randal Melen. "It will allow a physicist with a sudden new idea, an 'I wonder if…' moment, to quickly begin exploring that new idea."

Before the early 1990s, researchers analyzed much of their data from magnetic tape, having their computers spool through miles of it to find interesting events. As disk drives got larger and cheaper, and with the rise of computer clusters, much more of the data could be kept on disk. Yet these disks still required mechanical movement, limiting the speed at which researchers could begin accessing data. Computer technology has made great strides in speeding up the movement of data (bandwidth), but the time to get the first byte of data (latency) has been much slower to improve. "PetaCache, then, is really about improving the latency of testing new ideas," said Melen.
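The difference between bandwidth and latency can be made concrete with a simple cost model: each independent read pays the latency once, then transfers its bytes at the available bandwidth. The sketch below is illustrative only. The function name and every number in it (a 5 millisecond disk seek, a 0.1 millisecond memory-over-network access, 100 megabytes per second of bandwidth, 10-kilobyte events) are assumptions chosen for the example, not measurements from SLAC.

```python
def access_time(n_reads, bytes_per_read, latency_s, bandwidth_bytes_per_s):
    """Seconds needed for n_reads independent random reads.

    Each read pays the first-byte latency once, plus the time to
    transfer its bytes at the given bandwidth.
    """
    per_read = latency_s + bytes_per_read / bandwidth_bytes_per_s
    return n_reads * per_read


# All values below are hypothetical, chosen only to illustrate the model.
N_EVENTS = 1_000_000          # scattered events a physicist wants to read
EVENT_BYTES = 10_000          # assumed size of one event
BANDWIDTH = 100e6             # assumed 100 MB/s, same for both media
DISK_LATENCY_S = 5e-3         # assumed mechanical seek time
MEMORY_LATENCY_S = 1e-4       # assumed memory access across a network

disk_seconds = access_time(N_EVENTS, EVENT_BYTES, DISK_LATENCY_S, BANDWIDTH)
memory_seconds = access_time(N_EVENTS, EVENT_BYTES, MEMORY_LATENCY_S, BANDWIDTH)

print(f"disk:   {disk_seconds:,.0f} s")
print(f"memory: {memory_seconds:,.0f} s")
```

With identical bandwidth in both cases, the entire gap in this model comes from the latency term, which is why scattered reads of small events benefit far more from memory than large sequential scans would.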

To do this, PetaCache uses several types of memory, not disks. Although memory is much faster at getting this first byte of data, in the past it has been too expensive to buy in the quantities necessary to record and analyze the massive amounts of data taken at particle accelerators. Today, DRAM (Dynamic Random Access Memory) and flash memory are more affordable, and flash memory is expected to continue to drop in price as it is used more and more in consumer electronics such as digital cameras, iPod-like devices, and cell phones. If successful, the PetaCache project will allow researchers to use both DRAM and flash memory on a large scale.

The prototype PetaCache system comprises 64 server computers housed in two racks, each server with 16 gigabytes of DRAM, for a total of about one terabyte of memory. This large yet fragmented amount of memory is linked together with SCALLA (Structured Cluster Architecture for Low Latency Access), a computer program developed by SCCS Software Developer Andy Hanushevsky. SCALLA moves data from data servers to batch systems running physics analysis software with the lowest possible latencies. This load-balancing, self-organizing software distributes data across many data servers efficiently, making the individual machines appear as one huge chunk of memory to SCALLA-aware physics applications.

"The software makes good use of common hardware, so you don't have to make huge expenditures for great computing power," said Hanushevsky.

Right now, SLAC’s prototype system has one terabyte (roughly 1,000 gigabytes) of DRAM. With their next machine, the PetaCache team hopes to rely mainly on less expensive flash memory, which, according to SCCS Director Richard Mount, "holds future promise of cost-effective memory-based data-analysis systems."

This second-generation prototype will aim at a few tens of terabytes of flash memory, which would make the system useful to BaBar and LSST researchers. In the next decade, the PetaCache team hopes to expand the system to a petabyte (1,000 terabytes). This is around the scale of what is needed to be useful at the LHC.
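To put these scales side by side, the arithmetic can be sketched as follows. The per-server figure of 16 gigabytes and the count of 64 servers come from the prototype described above; the function names are hypothetical, and the assumption that a petabyte system would be built from servers of the same size is made purely for comparison, not taken from the project's plans.

```python
def total_memory_gb(servers, gb_per_server):
    """Aggregate memory, in gigabytes, of a cluster of identical servers."""
    return servers * gb_per_server


def servers_needed(target_gb, gb_per_server):
    """Smallest number of identical servers whose memory reaches target_gb."""
    return -(-target_gb // gb_per_server)  # integer ceiling division


GB_PER_SERVER = 16            # DRAM per server in the prototype
PROTOTYPE_SERVERS = 64        # servers in the prototype
PETABYTE_GB = 1_000 * 1_000   # 1,000 terabytes of 1,000 gigabytes each

prototype_gb = total_memory_gb(PROTOTYPE_SERVERS, GB_PER_SERVER)
petabyte_servers = servers_needed(PETABYTE_GB, GB_PER_SERVER)

print(f"prototype memory: {prototype_gb} GB")
print(f"16 GB servers needed for a petabyte: {petabyte_servers}")
```

Under that assumption, a petabyte would take tens of thousands of prototype-sized servers, which suggests why the team is looking to cheaper, denser flash memory rather than simply adding more DRAM machines.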

"Over the next few years, this type of memory technology will become much more common, from BaBar to the LHC to banks and airline reservation systems," said Research Director Emeritus David Leith. "They all benefit from being able to work from memory."

Source: Stanford Linear Accelerator Center, by Kelen Tuttle
