Faster performance evaluation of super-graphs

June 19, 2017
Faster performance evaluation of 'super graphs' is shown. Credit: DGIST

Himchan Park and Min-Soo Kim of DGIST have developed TrillionG, a computer model that generates synthetic data for simulating real-world applications that use giant graphs. TrillionG is faster than currently available synthetic graph generators and uses fewer computer resources, such as memory and network bandwidth.

Graphs are used to model data in a way that clarifies the relationships between entries. Each individual entry into a is known as a node, while the relationships, or connections, between these entries are known as edges.

The need for super-graphs to process huge amounts of data is greater than ever before. Google's PageRank algorithm, which ranks websites in Google's search engine, is a good example. It represents the web as a giant graph, with nodes representing each individual web page and edges representing the links from one page to another. Facebook's Apache Giraph graph processor maps all users of the social media site with more than one billion nodes. Their connections with each other reach more than 1 trillion edges.

The performance of giant graph algorithms and systems needs to be tested, but this requires the availability of data. Real data can't be used due to privacy laws. So fabricated, or synthetic, data is required. But synthetic data does not always follow the same relational rules as real data. Also, currently available synthetic graph generators require the use of supercomputers, using several thousand server computers connected via a high speed network due to the exceptionally large amount of data being analyzed.

Park and Kim have proposed a new model for graph generation. It is a compromise between two other currently available models that require significant computational time and memory space. The new reuses data that is kept in a very compact form and in a very fast cache memory during graph generation, making it more efficient and effective than existing models.

TrillionG generates more realistic synthetic data than both previous models and can also generate larger graphs. In addition, it can generate similar-sized trillion-edge graphs in a shorter period of time (two hours) using fewer computer resources (10 standard personal computers).

"Through extensive experiments, we have demonstrated that TrillionG outperforms the state-of-the-art graph generators by up to orders of magnitude," write the researchers in their study published in the Proceedings of the 2017 ACM International Conference on Management of Data.

The team expects that TrillionG could generate synthetic graphs the size of the human brain connectome, which consists of 100 trillion connections between neurons, using 240 standard personal computers. IT companies and universities could also use large-scale synthetic graphs as an essential tool for developing and evaluating new graph algorithms and systems.

Explore further: New discovery in connectome dynamics

More information: DOI: 10.1145/3035918.3064014

Related Stories

New discovery in connectome dynamics

July 1, 2016

From diffusion tensor imaging data of the Human Connectome Project, it is possible today to construct hundreds of graphs mapping the cerebral connections of a human subject. Each of these graphs has 1015 vertices and several ...

Quick drawing of complex relationships

October 19, 2015

Quality criteria for a readable graphic representation of complex relationships are high. For example, the node points have to be located at sufficient distances in order to be identifiable. At the same time, the graph drawing ...

Explained: Graphs

December 17, 2012

When most people hear the word "graph," an image springs to mind: a pair of perpendicular lines overlaid with a line, a curve, or bars of different heights.

Enhancing the efficiency of complex computations

November 29, 2013

Planning a trip from Berlin to Hamburg, simulating air flows around a new passenger airplane, or friendships on Facebook – many computer applications model relationships between objects by graphs (networks) in the sense ...

Recommended for you

Volumetric 3-D printing builds on need for speed

December 11, 2017

While additive manufacturing (AM), commonly known as 3-D printing, is enabling engineers and scientists to build parts in configurations and designs never before possible, the impact of the technology has been limited by ...

Tech titans ramp up tools to win over children

December 10, 2017

From smartphone messaging tailored for tikes to computers for classrooms, technology titans are weaving their way into childhoods to form lifelong bonds, raising hackles of advocacy groups.

0 comments

Please sign in to add a comment. Registration is free, and takes less than a minute. Read more

Click here to reset your password.
Sign in to get notified via email when new comments are made.