Researchers develop method to predict source of network diffusion

Aug 22, 2012 by Bob Yirka
A technique for finding the source of an epidemic with limited information was tested with data from a 2000 South African cholera epidemic, where the disease spread from village to village along a river network. Image credit: Physics 5, 89 (2012). DOI: 10.1103/Physics.5.89

In building network models, researchers have shown that available data and probability calculations can describe how information moves from a source node to many, and sometimes all, of the nodes in a network. Doing the reverse, i.e. finding the source after data has already diffused throughout a network, is not so easy. A model that could do so would have many applications, ranging from tracing rumors on Twitter back to the original poster to discovering where an epidemic got its start. Now research by a team at the École Polytechnique Fédérale de Lausanne in Switzerland has shown that, using techniques similar to the triangulation methods that can locate an individual phone from cell towers, it’s possible to estimate the source in a network from limited data sets. The team, led by Pedro Pinto, has published its findings in the journal Physical Review Letters.

To find the physical location of a single cell phone to within a few city blocks, engineers look at data from just three cell towers within range of the phone. By noting the time stamps on the incoming data, it’s possible to deduce, or triangulate, the likely position of the phone. Pinto et al. used a similar technique to narrow down the source of data that has diffused through a network.

The idea, they say, is to look at the times at which data arrives at a node, be it a cell tower, a village in Africa experiencing a cholera epidemic, or a member of a terrorist network whose leader is being sought. Nodes in any network can be linked by drawing lines between them. Tracing back in time then involves following the lines that are most likely to lead to the source. While that sounds easy, figuring out which lines to follow back most certainly is not, especially when information is limited or missing, or when a network is so large that looking at every node becomes impossible. That’s where the techniques the team developed come in. They used arrival times and probabilistic equations to derive maximum likelihood estimates that help them guess which path to take at each node.
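To make the arrival-time idea concrete, here is a minimal Python sketch. It is not the authors' estimator: it assumes every link adds the same mean delay and swaps in a plain least-squares fit for the maximum likelihood calculation. The network, the observer readings, and the names `hop_distances`, `estimate_source`, and `mean_delay` are all hypothetical. Because the moment the diffusion started is unknown, the sketch compares only differences in arrival time between observers.

```python
from collections import deque


def hop_distances(graph, start):
    """Breadth-first search: hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def estimate_source(graph, arrival_times, mean_delay=1.0):
    """Return the node whose hop distances best explain the arrival times.

    arrival_times maps each observer node to the time the data reached it.
    The start time is unknown, so only differences relative to one
    reference observer are compared (an assumption of this sketch).
    """
    observers = sorted(arrival_times)
    ref = observers[0]
    best_node, best_error = None, float("inf")
    for candidate in sorted(graph):
        dist = hop_distances(graph, candidate)
        if any(o not in dist for o in observers):
            continue  # candidate cannot reach every observer
        error = 0.0
        for o in observers[1:]:
            observed = arrival_times[o] - arrival_times[ref]
            expected = mean_delay * (dist[o] - dist[ref])
            error += (observed - expected) ** 2
        if error < best_error:
            best_node, best_error = candidate, error
    return best_node


# Hypothetical tree-shaped network (adjacency lists), loosely like villages
# along a branching river. The values below are illustrative, not real data.
network = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E", "F"],
    "E": ["D"],
    "F": ["D", "G"],
    "G": ["F"],
}

# Noisy arrival times seen by four of the seven nodes; the diffusion
# was imagined to start at "D" at an unknown time.
observed = {"A": 7.1, "C": 6.9, "E": 6.0, "G": 7.2}

print(estimate_source(network, observed))  # prints D
```

With only four of the seven nodes reporting, the candidate that best matches the timing differences is "D". On a larger or noisier network the best-scoring node could be several hops from the true source.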

It seems to work. They applied their modeling technique to a cholera outbreak that occurred in South Africa in 2000 and achieved an error of less than four hops using data from just twenty percent of the communities involved, which is quite impressive. Unfortunately, their techniques can only be applied under certain idealized conditions, for example when there’s a single source or when there’s only one choice at each node, but that doesn’t take away from what they’ve accomplished, which is likely the instigation of a whole new area of network research.


More information: Locating the Source of Diffusion in Large-Scale Networks, Phys. Rev. Lett. 109, 068702 (2012). DOI:10.1103/PhysRevLett.109.068702 (ArXiv preprint)

How can we localize the source of diffusion in a complex network? Because of the tremendous size of many real networks, such as the internet or the human social graph, it is usually unfeasible to observe the state of all nodes in a network. We show that it is fundamentally possible to estimate the location of the source from measurements collected by sparsely placed observers. We present a strategy that is optimal for arbitrary trees, achieving maximum probability of correct localization. We describe efficient implementations with complexity O(N^α), where α=1 for arbitrary trees and α=3 for arbitrary graphs. In the context of several case studies, we determine how localization accuracy is affected by various system parameters, including the structure of the network, the density of observers, and the number of observed cascades.

