Novel high-performance hybrid system for semantic factoring of graph databases

Sep 20, 2011
Massive amounts of data can be analyzed using a novel hybrid system for semantic factoring in graph databases. Each line represents a frequent relationship type between entities in the dataset.

Imagine trying to analyze all of the English entries in Wikipedia. Now imagine you've got 20 times as much information. That's the challenge scientists face when working with giga-scale data sets. Scientists at Pacific Northwest National Laboratory, Sandia National Laboratories and Cray, Inc. developed an application to take on such massive data analysis challenges. Their novel high-performance computing application uses semantic factoring to organize data, bringing out hidden connections and threads.

The team then used their application to analyze the massive datasets for the Billion Triple Challenge, an international competition focused on demonstrating capability and innovation in dealing with very large semantic graph databases, known as SGDs.

Why does it matter? Science. Security. In both areas, people must turn massive data sets into knowledge that can be used to save lives.

As SGD technology grows to handle extremely large data stores, it is becoming increasingly important to use high-performance computing for analysis, interpretation, and visualization, especially of the data's innate structure. However, understanding the semantic structure of a vast SGD still requires both a coherent methodology and a platform on which to exercise the necessary methods.

The team took advantage of the Cray XMT architecture, which allowed all 624 gigabytes of input data to be held in RAM. They were then able to scalably perform a variety of novel tasks for descriptive analysis of the inherent semantics in the dataset provided by the Billion Triple Challenge, including identifying the ontological structure, the sensitivity of connectivity within the relationships, and the interaction among different contributions to the dataset.
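One way to picture what "identifying the ontological structure" of a triple store can mean is to lift instance-level triples to the level of classes and count how often each relationship type links one class to another. The Python sketch below is only an illustration under that assumption; it is not the team's Cray XMT implementation, and the function name `factor_by_type`, the `"(untyped)"` placeholder, and the toy triples are all hypothetical.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"  # shorthand for the standard RDF typing predicate


def factor_by_type(triples):
    """Count (subject class, predicate, object class) patterns in a triple list.

    Hypothetical helper for illustration; not taken from the cited paper.
    """
    # First pass: record the declared classes of every entity.
    classes = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes[s].add(o)

    # Second pass: lift each non-typing triple to the class level.
    patterns = Counter()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for sc in classes.get(s, {"(untyped)"}):
            for oc in classes.get(o, {"(untyped)"}):
                patterns[(sc, p, oc)] += 1
    return patterns


# Toy data, invented for this example.
triples = [
    ("alice", RDF_TYPE, "Person"),
    ("bob", RDF_TYPE, "Person"),
    ("pnnl", RDF_TYPE, "Organization"),
    ("alice", "knows", "bob"),
    ("alice", "worksFor", "pnnl"),
    ("bob", "worksFor", "pnnl"),
]
summary = factor_by_type(triples)
for pattern, count in summary.most_common():
    print(pattern, count)
```

On a real dataset, the most frequent patterns in such a summary would correspond to the frequent relationship types between kinds of entities, which is the sort of structure a class-level view is meant to expose.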

The semantic database system research team is developing a prototype that can be adapted to a variety of application domains and datasets, including working with the bio2rdf.org and future billion-triple-challenge datasets in prototype testing and evaluation.


More information: Joslyn C, R Adolf, S al-Saffar, J Feo, E Goodman, D Haglin, G Mackey, and D Mizell. 2010. "High Performance Semantic Factoring of Giga-Scale Semantic Graph Databases." Semantic Web Challenge Billion Triple Challenge 2010. cass-mt.pnl.gov/btc2010/pnnl_btc.pdf
