December 28, 2016

New data-mining strategy that offers unprecedented pattern search speed could glean new insights from massive datasets

by King Abdullah University of Science and Technology

Divide and conquer pattern searching: a new data-mining strategy that offers unprecedented pattern search speed could lead to new insights from massive data. Credit: © Mopic / Alamy Stock Photo DTFTEM

Searching for recurring patterns in network systems has become a fundamental part of research and discovery in fields as diverse as biology and social media. KAUST researchers have developed a pattern or graph-mining framework that promises to significantly speed up searches on massive network data sets.

"A graph is a data structure that models complex relationships among objects," explained Panagiotis Kalnis, leader of the research team from the KAUST Extreme Computing Research Center. "Graphs are widely used in many modern applications, including social networks, biological networks like protein-to-protein interactions, and communication networks like the internet."

In these applications, one of the most important operations is the process of finding recurring graphs that reveal how objects tend to connect to each other. The process, which is called frequent subgraph mining (FSM), is an essential building block of many knowledge extraction techniques in social studies, bioinformatics and image processing, as well as in security and fraud detection. However, graphs may contain hundreds of millions of objects and billions of relationships, which means that extracting recurring patterns places huge demands on time and computing resources.
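
To make the idea of a "recurring pattern" concrete, here is a minimal sketch in Python that covers only the smallest possible pattern, a single edge between two labeled objects. It is an illustration, not the method used by the KAUST team: real FSM systems grow patterns to larger subgraphs and use more careful support measures. The function name, the labels and the `min_support` threshold are all assumptions made for this example.

```python
from collections import Counter

def frequent_edge_patterns(labels, edges, min_support):
    """Return the label pairs that appear on at least `min_support` edges.

    labels: dict mapping node id -> label (hypothetical toy data)
    edges: list of undirected (node, node) pairs
    """
    counts = Counter()
    for u, v in edges:
        # Sort the two labels so (A, B) and (B, A) count as one undirected pattern.
        pattern = tuple(sorted((labels[u], labels[v])))
        counts[pattern] += 1
    # Keep only the patterns that recur often enough to be called "frequent".
    return {p: c for p, c in counts.items() if c >= min_support}

# Toy network: three proteins and two genes, six relationships.
labels = {1: "protein", 2: "protein", 3: "gene", 4: "gene", 5: "protein"}
edges = [(1, 2), (1, 3), (2, 4), (5, 3), (5, 4), (3, 4)]

result = frequent_edge_patterns(labels, edges, min_support=3)
print(result)
```

In this toy network only the gene-protein connection recurs often enough to pass the threshold. The cost of the real problem comes from repeating this kind of counting for every larger candidate pattern, each of which requires expensive subgraph matching.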

"In essence, if we can provide a better algorithm, all the applications that depend on FSM will be able to perform deeper analysis on larger data in less time," Kalnis noted.

Kalnis and his colleagues developed a system called ScaleMine that offers up to a ten-fold acceleration compared with existing methods.

"FSM involves a vast number of graph operations, each of which is computationally expensive, so the only practical way to support FSM in large graphs is by massively parallel computation," he said.

In parallel computing, the graph search is divided into multiple tasks and each is run simultaneously on its own processor. If the tasks are too large, the entire search is held up by waiting for the slowest task to complete; if the tasks are too small, the extra communication needed to coordinate the parallelization becomes a significant additional computational load.
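
The trade-off can be seen in a small simulation. The sketch below is a toy model rather than a measurement of any real system: it assumes a fixed coordination cost per task (`overhead_per_task`, a made-up parameter) and hands each task to whichever processor is currently least loaded. The total amount of work is the same in all three cases; only the task size changes.

```python
import heapq

def makespan(task_costs, workers, overhead_per_task=0.0):
    """Finish time when each task goes to the currently least-loaded worker."""
    loads = [0.0] * workers
    heapq.heapify(loads)
    # Hand out the largest tasks first, a common greedy scheduling heuristic.
    for cost in sorted(task_costs, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + cost + overhead_per_task)
    # The search is done only when the busiest worker is done.
    return max(loads)

# 100 units of work on 4 workers, with 1 unit of coordination cost per task.
coarse = makespan([100.0], workers=4, overhead_per_task=1.0)        # one huge task
balanced = makespan([25.0] * 4, workers=4, overhead_per_task=1.0)   # one task per worker
fine = makespan([0.1] * 1000, workers=4, overhead_per_task=1.0)     # many tiny tasks

print(coarse, balanced, fine)
```

Under these assumptions the single large task leaves three workers idle, while the thousand tiny tasks spend most of their time on coordination. The middle split finishes first, which is why choosing the task size well matters.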

Kalnis' team overcame this limitation by performing the search in two steps: a first approximation step that determines the search space and the optimal division of tasks, and a second computational step in which large tasks are split dynamically into the optimal number of subtasks. This resulted in searches up to ten times faster than previously possible.
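
The article does not describe how ScaleMine estimates task sizes or decides how finely to split them, so the following Python sketch only illustrates the general "estimate first, then split" idea. The cost estimates, the fair-share rule and the `min_chunk` limit are all assumptions chosen for the example, not details of the actual system.

```python
import math

def plan_subtasks(estimated_costs, workers, min_chunk):
    """Decide how many subtasks each task should be split into.

    estimated_costs: dict of task name -> approximate cost (stand-in for step one)
    workers: number of processors available
    min_chunk: smallest subtask worth the coordination overhead (assumed value)
    """
    total = sum(estimated_costs.values())
    fair_share = total / workers  # work per processor if the load were perfectly even
    plan = {}
    for task, cost in estimated_costs.items():
        if cost <= fair_share:
            plan[task] = 1  # small enough to run as a single task
        else:
            pieces = math.ceil(cost / fair_share)
            # Never split so finely that a piece falls below min_chunk.
            max_pieces = max(1, int(cost // min_chunk))
            plan[task] = min(pieces, max_pieces)
    return plan

# Hypothetical estimates from an approximation pass over four candidate patterns.
estimates = {"pattern_a": 80.0, "pattern_b": 10.0, "pattern_c": 6.0, "pattern_d": 4.0}
plan = plan_subtasks(estimates, workers=4, min_chunk=5.0)
print(plan)
```

Here only the one oversized task is split, into four pieces of roughly a fair share each, while the small tasks are left whole to avoid needless coordination.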

"Hopefully this performance improvement will enable deeper and more accurate analysis of large graph data and the extraction of new knowledge," Kalnis said.

Provided by King Abdullah University of Science and Technology

Citation: New data-mining strategy that offers unprecedented pattern search speed could glean new insights from massive datasets (2016, December 28) retrieved 21 July 2024 from https://phys.org/news/2016-12-data-mining-strategy-unprecedented-pattern-glean.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
