Novel high-performance hybrid system for semantic factoring of graph databases

Sep 20, 2011
Massive amounts of data can be analyzed using a novel hybrid system for semantic factoring in graph databases. Each line represents a frequent relationship type between entities in the dataset.

Imagine trying to analyze all of the English entries in Wikipedia. Now imagine you've got 20 times as much information. That's the challenge scientists face when working with gigabyte data sets. Scientists at Pacific Northwest National Laboratory, Sandia National Laboratories and Cray, Inc. developed an application to take on such massive data analysis challenges. Their novel high-performance computing application uses semantic factoring to organize data, bringing out hidden connections and threads.

The team then used their applications to analyze the massive datasets for the Billion Triple Challenge, an international competition focused on demonstrating capability and innovation for dealing with very large semantic graph databases, known as SGDs.

Why it matters? Science. Security. In both areas, people must turn massive data sets into knowledge that can be used to save lives.

As SGD technology grows to address components from extremely large data stores, it is becoming increasingly important to be able to use high-performance for analysis, interpretation, and visualization, especially as it pertains to the innate structure. However, the ability to understand the semantic structure of a vast SGD still needs both a coherent methodology and the platform to exercise the necessary methods.

The team took advantage of the Cray XMT architecture, which allowed all 624 gigabytes of input data to be held in RAM. They were then able to scalably perform a variety of novel tasks for descriptive analysis of the inherent semantics in the dataset provided by the Billion Triple Challenge, including identifying the ontological structure, the sensitivity of connectivity within the relationships, and the interaction among different contributions to the dataset.

The semantic database system research team is developing a prototype that can be adapted to a variety of application domains and datasets, including working with the bio2rdf.org and future billion-triple-challenge datasets in prototype testing and evaluation.

Explore further: Computer-assisted accelerator design

More information: Joslyn C, R Adolf, S al-Saffar, J Feo, E Goodman, D Haglin, G Mackey, and D Mizell. 2010. "High Performance Semantic Factoring of Giga-Scale Semantic Graph Databases." Semantic Web Challenge Billion Triple Challenge 2010. cass-mt.pnl.gov/btc2010/pnnl_btc.pdf

add to favorites email to friend print save as pdf

Related Stories

Customizing supercomputers from the ground up

May 27, 2010

(PhysOrg.com) -- Computer scientist Adolfy Hoisie has joined the Department of Energy's Pacific Northwest National Laboratory to lead PNNL's high performance computing activities. In one such activity, Hoisie will direct ...

Enter the semantic grid

Jan 31, 2006

Working under the IST programme-funded OntoGrid project, they are catalysing the evolution of the Grid from a distributed network of computers in which the meaning of information is implicit and hidden into ...

Recommended for you

Computer-assisted accelerator design

20 hours ago

Stephen Brooks uses his own custom software tool to fire electron beams into a virtual model of proposed accelerator designs for eRHIC. The goal: Keep the cost down and be sure the beams will circulate in ...

First steps towards "Experimental Literature 2.0"

Apr 21, 2014

As part of a student's thesis, the Laboratory of Digital Humanities at EPFL has developed an application that aims at rearranging literary works by changing their chapter order. "The human simulation" a saga ...

User comments : 0

More news stories

Old tires become material for new and improved roads

(Phys.org) —Americans generate nearly 300 million scrap tires every year, according to the Environmental Protection Agency (EPA). Historically, these worn tires often end up in landfills or, when illegally ...

Mantis shrimp stronger than airplanes

(Phys.org) —Inspired by the fist-like club of a mantis shrimp, a team of researchers led by University of California, Riverside, in collaboration with University of Southern California and Purdue University, ...

Volitional control from optical signals

(Medical Xpress)—In their quest to build better BMIs, or brain-machine-interfaces, researchers have recently begun to look closer at the sub-threshold activity of neurons. The reason for this trend is that ...