Researchers develop method to predict source of network diffusion

Aug 22, 2012 by Bob Yirka
A technique for finding the source of an epidemic with limited information was tested with data from a 2000 South African cholera epidemic, where the disease spread from village to village along a river network. Image credit: Physics 5, 89 (2012). DOI: 10.1103/Physics.5.89

In building network models, researchers have shown that available data and probability calculations can describe how information moves from a source node to many, and sometimes all, of the nodes in a network. Doing the reverse, finding the source after information has already diffused through a network, is much harder. A model that could do so would have many applications, ranging from tracing rumors on Twitter back to the original poster to discovering where an epidemic got its start. Now a team at the École Polytechnique Fédérale de Lausanne in Switzerland has shown that, using techniques similar to the triangulation methods that locate an individual phone from cell towers, it’s possible to estimate the source in a network from limited data sets. The team, led by Pedro Pinto, has published its findings in the journal Physical Review Letters.

To find the physical location of a single cell phone to within a few city blocks, engineers look at data from just three cell towers near the phone. By comparing the time stamps on the incoming data, it’s possible to deduce, or triangulate, the likely position of the phone. Pinto et al. used a similar technique to narrow down the source of data that has diffused through a network.
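The cell-tower analogy can be made concrete with a small sketch. It is only an illustration of locating a sender from arrival time stamps, not how any real carrier does it: the function name, the flat 2D grid, the tower coordinates and the signal speed are all assumptions made for the example. Because the moment of transmission is unknown, the search compares differences between arrival times rather than the times themselves.

```python
import math

def locate_by_arrival_times(towers, arrival_times, speed, grid_step=1.0, extent=10.0):
    """Grid-search the point whose predicted arrival-time differences
    best match the observed ones (hypothetical helper, brute force)."""
    ref_x, ref_y = towers[0]
    # Differences relative to the first tower cancel the unknown send time.
    observed = [t - arrival_times[0] for t in arrival_times[1:]]
    best_pos, best_err = None, math.inf
    steps = int(extent / grid_step) + 1
    for i in range(steps):
        for j in range(steps):
            x, y = i * grid_step, j * grid_step
            d_ref = math.hypot(x - ref_x, y - ref_y)
            err = 0.0
            for (tx, ty), obs in zip(towers[1:], observed):
                predicted = (math.hypot(x - tx, y - ty) - d_ref) / speed
                err += (predicted - obs) ** 2
            if err < best_err:
                best_pos, best_err = (x, y), err
    return best_pos

# Made-up example: three towers, a phone at (3, 4), unknown send time 12.5.
towers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
phone = (3.0, 4.0)
times = [12.5 + math.hypot(phone[0] - tx, phone[1] - ty) for tx, ty in towers]
estimate = locate_by_arrival_times(towers, times, speed=1.0)
```

With noiseless time stamps and a phone that sits exactly on the grid, the search recovers the phone's position; with noisy data it would return the closest-fitting grid point instead.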

The idea, they say, is to look at the arrival times of data at a node, whether that node is a cell tower, a village in Africa experiencing a cholera epidemic, or a member of a terrorist network whose leader is being sought. Nodes in any network can be connected by drawing lines between them. Tracing back in time then involves following the lines that are most likely to lead to the source. While that sounds easy, figuring out which lines to follow most certainly is not, especially when information is limited or missing, or when a network is so large that looking at every node becomes impossible. That’s where the techniques the team developed come in handy. They used arrival times and probabilistic equations to derive maximum likelihood estimates that help them guess which path to take at each node.
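A rough sketch of the idea follows. It is a simplified least-squares fit, not the maximum likelihood estimator from the paper: it assumes a tree-shaped network, a single source, a known average delay per link, and a handful of observer nodes that report when the data reached them. The function names and the toy network are invented for illustration.

```python
from collections import deque

def hop_distances(adjacency, start):
    """Breadth-first search: hops from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def estimate_source(adjacency, observed_times, mean_delay):
    """Pick the node whose predicted arrival-time differences best fit
    the observers' reports (least squares, an assumed simplification)."""
    observers = sorted(observed_times)
    ref = observers[0]  # differences against one observer cancel the unknown start time
    best_node, best_err = None, float("inf")
    for candidate in adjacency:
        dist = hop_distances(adjacency, candidate)
        err = 0.0
        for obs in observers[1:]:
            predicted = mean_delay * (dist[obs] - dist[ref])
            actual = observed_times[obs] - observed_times[ref]
            err += (predicted - actual) ** 2
        if err < best_err:
            best_node, best_err = candidate, err
    return best_node

# Toy tree with made-up, slightly noisy arrival times at four observers.
tree = {
    "A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D", "G"],
    "D": ["C"], "E": ["B", "F"], "F": ["E"], "G": ["C"],
}
reports = {"A": 7.1, "D": 6.0, "F": 7.9, "G": 6.2}
guess = estimate_source(tree, reports, mean_delay=1.0)
```

On this toy tree the fit picks node C, which is one hop from D and G, two from A and three from F, matching the order in which the observers were reached. Checking every candidate with a fresh breadth-first search is fine for a small example but is not the efficient implementation the researchers describe.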

It seems to work. They applied their modeling technique to the cholera outbreak that occurred in South Africa in 2000 and achieved an error of fewer than four hops using data from just twenty percent of the communities involved, which is quite impressive. Unfortunately, their techniques can only be applied under certain idealized conditions, i.e. when there’s a single source, when there’s only one choice at each node, etc. That doesn’t take away from what they’ve accomplished, which is likely the start of a whole new area of network research.

More information: Locating the Source of Diffusion in Large-Scale Networks, Phys. Rev. Lett. 109, 068702 (2012). DOI:10.1103/PhysRevLett.109.068702 (ArXiv preprint)

How can we localize the source of diffusion in a complex network? Because of the tremendous size of many real networks, such as the internet or the human social graph, it is usually unfeasible to observe the state of all nodes in a network. We show that it is fundamentally possible to estimate the location of the source from measurements collected by sparsely placed observers. We present a strategy that is optimal for arbitrary trees, achieving maximum probability of correct localization. We describe efficient implementations with complexity O(N^α), where α=1 for arbitrary trees and α=3 for arbitrary graphs. In the context of several case studies, we determine how localization accuracy is affected by various system parameters, including the structure of the network, the density of observers, and the number of observed cascades.
