Researchers demonstrate breakthrough storage performance for big data applications

July 22, 2011

Researchers from IBM today demonstrated the future of large-scale storage systems by successfully scanning 10 billion files on a single system in just 43 minutes, shattering the previous record of one billion files in three hours by a factor of 37.

As data grows at unprecedented scale, this advance allows data environments to be unified on a single platform instead of being distributed across several systems that must be managed separately. It also dramatically reduces and simplifies data management tasks, allowing more information to be stored in the same technology rather than requiring continual purchases of additional storage.

In 1998, IBM researchers unveiled a highly scalable, clustered parallel file system called General Parallel File System (GPFS), which was further tuned to make this breakthrough possible. GPFS represents a major advance in scaling storage performance and capacity while keeping management costs flat. This innovation could help organizations cope with the exploding growth of data, transactions and digitally aware sensors and other devices that comprise Smarter Planet systems. It is ideally suited for applications requiring high-speed access to large volumes of data, such as data mining to determine customer buying behaviors across data sets, seismic data processing, risk management and financial analysis, weather modeling and scientific research.

Driving New Levels of Storage Performance

Today's breakthrough was achieved using GPFS running on a cluster of 10 eight-core systems with solid-state storage, which took 43 minutes to perform this selection. The GPFS management rules engine provides comprehensive capabilities to service any data management task.
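To put those figures in perspective, the following is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above (10 billion files, 43 minutes, 10 systems with eight cores each). The variable names are illustrative, and the even per-core split is a simplifying assumption rather than a measured result.

```python
# Figures taken from the announcement.
FILES_SCANNED = 10_000_000_000
SCAN_MINUTES = 43
SYSTEMS = 10
CORES_PER_SYSTEM = 8

scan_seconds = SCAN_MINUTES * 60
total_cores = SYSTEMS * CORES_PER_SYSTEM

# Average scan rate over the whole run.
files_per_second = FILES_SCANNED / scan_seconds

# Assumption: the work is spread evenly over every core.
files_per_second_per_core = files_per_second / total_cores

print(f"{files_per_second:,.0f} files/s overall")
print(f"{files_per_second_per_core:,.0f} files/s per core")
```

Under these assumptions the cluster averages roughly 3.9 million files per second, or about 48,000 files per second on each of its 80 cores.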

GPFS's advanced algorithm makes possible the full use of all cores on all of these machines in all phases of the task (data read, sorting and rules evaluation).
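The general idea of spreading a scan across many workers can be illustrated with a small Python sketch. This is not GPFS code and does not reflect its actual algorithm. The record type, the selection rule and the function names are all hypothetical, and threads stand in for what would be separate cores and machines in a real cluster. Each worker reads its own shard, evaluates the rule and sorts its matches, and the sorted runs are then merged.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file."""
    path: str
    size_bytes: int
    age_days: int


def matches_rule(rec: FileRecord) -> bool:
    # Hypothetical rule: files older than 30 days and larger than 1 MiB.
    return rec.age_days > 30 and rec.size_bytes > (1 << 20)


def scan_shard(shard: list) -> list:
    # One worker: read its shard, evaluate the rule, sort matches by path.
    selected = [rec for rec in shard if matches_rule(rec)]
    return sorted(selected, key=lambda r: r.path)


def parallel_scan(records: list, workers: int = 4) -> list:
    # Split the records into one shard per worker (round-robin).
    shards = [records[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(scan_shard, shards))
    # Merge the per-worker sorted runs into one globally sorted result.
    return list(heapq.merge(*sorted_runs, key=lambda r: r.path))


# Synthetic metadata, purely for illustration.
records = [
    FileRecord(f"/data/f{i:04d}", size_bytes=i * 4096, age_days=i % 90)
    for i in range(2000)
]
selected = parallel_scan(records)
print(len(selected), "files selected")
```

The sketch filters before sorting so that each worker only sorts the files it has selected. That ordering is a choice made for the example, not a description of how GPFS sequences its phases.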

GPFS stores the file system metadata on solid-state storage appliances with only 6.8 terabytes of capacity, exploiting their excellent random-access performance and high data transfer rates. The appliances sustain hundreds of millions of data input/output operations while GPFS continuously identifies, selects and sorts the right set of files among the 10 billion on the system.

"Today's demonstration of GPFS scalability will pave the way for new products that address the challenges of a rapidly growing, multi-zettabyte world," said Doug Balog, vice president, storage platforms, IBM. "This has the potential to enable much larger data environments to be unified on a single platform and dramatically reduce and simplify data management tasks such as data placement, aging, backup and migration of individual files."

The previous record was also set by IBM researchers at the Supercomputing 2007 conference in Reno, NV, where they demonstrated the ability to scan one billion files in three hours.

"Businesses in every industry are looking to the future of storage and data management as we face a problem springing from the very core of our success – managing the massive amounts of data we create on a daily basis," said Bruce Hillsberg, a director at IBM Research – Almaden. "From banking systems to MRIs and traffic sensors, our day-to-day lives are engulfed in data. But, it can only be useful if it is effectively stored, analyzed and applied, and businesses and governments have relied on smarter technology systems as the means to manage and leverage the constant influx of data and turn it into valuable insights."

IBM Research continues to develop innovative storage technologies to help clients not only manage data proliferation, but harness data to create new services. In the past year alone, IBM storage products included over five significant storage innovations invented by IBM Research including IBM Easy Tier, Storwize V7000, Scale-out Network Attached Storage (SONAS), IBM Information Archive and IBM Long Term File System (LTFS).

With the volume of digital data up 47 percent over last year, businesses are under tremendous pressure to quickly turn data into actionable insights, yet they grapple with how to manage and store it all. As new applications emerge in industries from financial services to healthcare, traditional data management systems will be unable to perform common but critical storage management tasks, leaving organizations exposed to critical data loss.

Anticipating these challenges decades ago, researchers from IBM Research – Almaden created GPFS to help businesses cope with the exploding growth of data, transactions and digitally aware devices on a single system. Already deployed to perform tasks like backup, information lifecycle management, disaster recovery and content distribution, this technology's unique approach overcomes the challenge of managing unprecedentedly large file systems by combining multi-system parallelization with fast access to file system metadata stored on a solid-state storage appliance.


Provided by IBM

