June 2, 2014

Billion inserts-per-second data milestone reached for supercomputing tool

(Phys.org): At Los Alamos, a supercomputer epicenter where "big data set" really means something, a data middleware project has achieved a milestone for specialized information organization and storage. The Multi-dimensional Hashed Indexed Middleware (MDHIM) project at Los Alamos National Laboratory recently achieved 1,782,105,749 key/value inserts per second into a globally ordered key space on the laboratory's Moonlight supercomputer.

"In the current highly parallel computing world, the need for scalability has forced the world away from fully transactional databases and back to the loosened semantics of key value stores," says Gary Grider, High Performance Computing division leader at Los Alamos.

Computer simulations overall are scaling to higher parallel-processor counts, simulating finer physical scales or more complex physical interactions. As they do so, the simulations produce ever-larger data sets that must be analyzed to yield the insights scientists need.

"This milestone was achieved by a combination of good software design and refined algorithms. Our code is available on GitHub and we encourage others to build upon it," said Hugh Greenberg, project leader and lead developer of the MDHIM project.

Traditionally, much data analysis has been visual; data are turned into images or movies. Statistical analysis generally occurs over the entire data set. But more detailed analysis on entire data sets is becoming untenable because of the resources required to move, search, and analyze all the data at once. The ability to identify, retrieve, and analyze smaller subsets of data within the multidimensional whole would make detailed analysis much more practical. To achieve this, it becomes essential to find strategies for managing these multiple dimensions of simulation data.

The MDHIM project aims to create a middle-ground framework between fully relational databases and distributed but completely local constructs like "map/reduce." MDHIM allows applications to take advantage of the mechanisms provided by a parallel key-value store: storing data in global multi-dimensional order and sub-setting of massive data in multiple dimensions as well as the functions of a distributed hash table with simple but massively parallel lookups.
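
To make the idea of a globally ordered key space concrete, here is a minimal single-process sketch in Python. It is not MDHIM's API or implementation: the class name, the split points, and the use of plain dictionaries as "servers" are all assumptions made for illustration. It shows only the general technique of range partitioning, in which each server owns a contiguous slice of the key space, so a single key is found with one lookup and a range query touches only the servers whose slices overlap the range.

```python
from bisect import bisect_right


class RangePartitionedStore:
    """Toy stand-in for a range-partitioned key/value store (hypothetical)."""

    def __init__(self, split_points):
        # split_points must be sorted. Server i owns keys k with
        # split_points[i-1] <= k < split_points[i]; the first and last
        # servers are unbounded on their outer side.
        self.split_points = list(split_points)
        self.servers = [dict() for _ in range(len(self.split_points) + 1)]

    def server_for(self, key):
        # Binary search over the split points gives the owning server.
        return bisect_right(self.split_points, key)

    def insert(self, key, value):
        self.servers[self.server_for(key)][key] = value

    def get(self, key):
        return self.servers[self.server_for(key)].get(key)

    def range_query(self, lo, hi):
        """Return (key, value) pairs with lo <= key < hi in global key order."""
        if lo >= hi:
            return []
        first, last = self.server_for(lo), self.server_for(hi)
        out = []
        # Servers hold disjoint, ordered slices, so visiting them in index
        # order and sorting locally yields a globally ordered result.
        for s in range(first, last + 1):
            local = sorted(self.servers[s].items())
            out.extend((k, v) for k, v in local if lo <= k < hi)
        return out


store = RangePartitionedStore([100, 200])  # three hypothetical servers
for key, value in [(5, "a"), (150, "b"), (250, "c"), (99, "d"), (100, "e")]:
    store.insert(key, value)
print(store.range_query(90, 201))
```

In a real deployment each slice would live on a different node and the lookups would travel over the interconnect; the sketch keeps everything in one process so that the ownership rule is easy to see.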

Records are sorted globally in whichever number of ways an application chooses. Applications can choose to implement, via the MDHIM library, anything from a shared-nothing map/reduce-style functionality to deeply indexed data with rich information about statistical distributions within all keys. This allows global statistical analysis and retrieval of relevant data subsets for further analysis.
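
As a rough illustration of sorting the same records in more than one way, the following Python sketch keeps one sorted index per application-chosen key function and reports a simple summary of each index's key distribution. The class, its method names, and the example fields (`t`, `temp`) are hypothetical and are not taken from the MDHIM library; the sketch only demonstrates the general idea of secondary indexes over a shared set of records.

```python
from bisect import bisect_left, insort


class MultiIndex:
    """Toy multi-index over a list of records (hypothetical, single process)."""

    def __init__(self, **key_funcs):
        # Each keyword argument names an index and gives the function
        # that extracts that index's key from a record.
        self.key_funcs = key_funcs
        self.records = []
        self.indexes = {name: [] for name in key_funcs}

    def insert(self, record):
        rid = len(self.records)
        self.records.append(record)
        for name, fn in self.key_funcs.items():
            # Keep (key, record id) pairs sorted by key.
            insort(self.indexes[name], (fn(record), rid))

    def subset(self, name, lo, hi):
        """Records whose key in index `name` satisfies lo <= key < hi."""
        idx = self.indexes[name]
        # Record ids are never negative, so (k, -1) sorts before every
        # stored entry that has key k.
        start = bisect_left(idx, (lo, -1))
        end = bisect_left(idx, (hi, -1))
        return [self.records[rid] for _, rid in idx[start:end]]

    def key_range(self, name):
        """Smallest and largest key in an index, or None if it is empty."""
        idx = self.indexes[name]
        return (idx[0][0], idx[-1][0]) if idx else None


m = MultiIndex(time=lambda r: r["t"], temp=lambda r: r["temp"])
for rec in [{"t": 3, "temp": 10.0}, {"t": 1, "temp": 30.0}, {"t": 2, "temp": 20.0}]:
    m.insert(rec)
print(m.subset("time", 1, 3))
print(m.subset("temp", 15, 35))
print(m.key_range("temp"))
```

Each index returns the subset in its own sort order, which is the property that lets an application pull out a slice along one dimension without scanning the whole data set.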

MDHIM is designed to represent petabytes of data with mega- to gigabytes of representation data, utilizing the natural advantages of HPC interconnects—low latency, high bandwidth, and collective-friendliness—to scale key/value service to millions of cores, implying a need for billions of inserts per second.

In this sample scaling run, MDHIM ran as an MPI library on 3,360 processors within 280 nodes of the 308-node Moonlight system and demonstrated nearly two billion inserts per second.
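
The figures reported for the run can be broken down with some simple arithmetic. The short Python snippet below uses only the numbers stated in this article; the per-processor and per-node rates are averages derived from the aggregate figure, not measurements reported by the project, and the variable names are ours.

```python
# Figures stated in the article.
total_inserts_per_sec = 1_782_105_749
processors = 3360
nodes_used = 280
nodes_total = 308

# Derived averages (assumes the load was spread evenly, which the
# article does not state).
per_processor = total_inserts_per_sec / processors
per_node = total_inserts_per_sec / nodes_used
processors_per_node = processors // nodes_used
fraction_of_machine = nodes_used / nodes_total

print(f"{per_processor:,.0f} inserts/s per processor on average")   # 530,389
print(f"{per_node:,.0f} inserts/s per node on average")             # 6,364,663
print(f"{processors_per_node} processors per node")                 # 12
print(f"{fraction_of_machine:.1%} of Moonlight's nodes were used")  # 90.9%
```

On these averages, each processor sustained roughly half a million inserts per second, which gives a sense of the per-core rate that would have to hold up as core counts grow toward the millions.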

MDHIM is a framework on which an application can run thousands of copies of existing key value stores, in multiple programming environments, exploiting the capabilities of an extreme scale computing system. MDHIM, which is sponsored by the U.S. Department of Defense, is being used extensively within the Storage and I/O portion of the DOE FastForward project, which has as its objective "to initiate partnerships with multiple companies to accelerate the R&D of critical component technologies needed for extreme-scale computing."

Provided by Los Alamos National Laboratory

Citation: Billion inserts-per-second data milestone reached for supercomputing tool (2014, June 2) retrieved 16 August 2024 from https://phys.org/news/2014-06-billion-inserts-per-second-milestone-supercomputing-tool.html
