Researchers develop method to predict source of network diffusion

Aug 22, 2012 by Bob Yirka
A technique for finding the source of an epidemic with limited information was tested with data from a 2000 South African cholera epidemic, where the disease spread from village to village along a river network. Image credit: Physics 5, 89 (2012). DOI: 10.1103/Physics.5.89

(Phys.org) -- In building network models, researchers have shown that available data and probability calculations can describe how information moves from a source node to many, and sometimes all, of the nodes in a network. Doing the reverse, i.e. finding the source after data has already diffused throughout a network, is not so easy. A model that could do so would have many applications, ranging from tracing rumors on Twitter back to the original poster to discovering where an epidemic got its start. Now new research by a team at the École Polytechnique Fédérale de Lausanne in Switzerland has shown that, using techniques similar to the triangulation methods that locate an individual phone from cell towers, it’s possible to estimate the source in a network from limited data. The team, led by Pedro Pinto, has published its findings in the journal Physical Review Letters.

To find the physical location of a single cell phone to within a few city blocks, engineers look at data from just three cell towers surrounding the phone. By comparing the time stamps on the incoming signal, it’s possible to deduce, or triangulate, the likely position of the phone. Pinto et al. used a similar technique to narrow down the source of data that has diffused through a network.
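The timing idea can be illustrated with a minimal Python sketch. It is not the procedure a phone carrier or the EPFL team actually uses; the tower coordinates, signal speed, emission time and the `locate` helper are all assumptions made up for illustration. The sketch searches a grid for the position at which every tower implies the same (unknown) emission time.

```python
import math

def locate(towers, arrival_times, speed, grid_step=0.1, extent=10.0):
    """Grid-search the point whose implied emission times agree best across towers.

    Hypothetical helper: the emission time is unknown, so only the spread
    of the implied emission times (not their value) is minimized.
    """
    best, best_cost = None, float("inf")
    steps = int(round(extent / grid_step))
    for i in range(steps + 1):
        for j in range(steps + 1):
            x, y = i * grid_step, j * grid_step
            # Emission time each tower would imply if the phone were at (x, y).
            implied = [t - math.hypot(x - tx, y - ty) / speed
                       for (tx, ty), t in zip(towers, arrival_times)]
            mean = sum(implied) / len(implied)
            cost = sum((e - mean) ** 2 for e in implied)
            if cost < best_cost:
                best, best_cost = (x, y), cost
    return best

# Made-up setup: three towers, a phone at (3, 4), signal sent at time 5.
towers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos, emit_time, speed = (3.0, 4.0), 5.0, 1.0
arrivals = [emit_time + math.hypot(true_pos[0] - tx, true_pos[1] - ty) / speed
            for tx, ty in towers]

print(locate(towers, arrivals, speed))
```

Only differences between arrival times matter here, which is the same property the network method relies on: the moment the diffusion started is not known in advance.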

The idea, they say, is to look at the arrival times of data at a node, be it a cell tower, a village in Africa experiencing a cholera epidemic or a member of a terrorist network whose leader is being sought. Nodes in any network can be connected by drawing lines between them. Tracing back in time then involves following the lines that are most likely to lead to the source. While that sounds easy, figuring out which lines to follow back is not, especially when information is limited or missing, or when a network is so large that looking at every node becomes impossible. That’s where the techniques the team developed come in. They used arrival times and probabilistic equations to derive maximum likelihood estimates that indicate which path to take at each node.
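A rough Python sketch of the arrival-time idea follows. It is a simplified least-squares stand-in, not the estimator published by Pinto and colleagues: it assumes a known average delay per hop and ignores the correlations between delays that a full maximum likelihood treatment would take into account. The network, the observer readings and the function names are hypothetical.

```python
from collections import deque

def hop_distances(graph, start):
    """Breadth-first search: number of hops from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def estimate_source(graph, observations, mean_delay):
    """Pick the node whose predicted arrival-time differences best fit the data.

    observations maps observer node -> arrival time. The start time is
    unknown, so all times are compared relative to the first observer.
    """
    observers = list(observations)
    ref = observers[0]
    best, best_cost = None, float("inf")
    for candidate in graph:
        dist = hop_distances(graph, candidate)
        if any(o not in dist for o in observers):
            continue  # candidate cannot reach every observer
        cost = 0.0
        for o in observers[1:]:
            predicted = mean_delay * (dist[o] - dist[ref])
            observed = observations[o] - observations[ref]
            cost += (observed - predicted) ** 2
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best

# Made-up tree; only four of the seven nodes report an arrival time.
graph = {
    "a": ["b"], "b": ["a", "c", "d"], "c": ["b"],
    "d": ["b", "e"], "e": ["d", "f", "g"], "f": ["e"], "g": ["e"],
}
observations = {"a": 102.1, "c": 101.9, "f": 102.2, "g": 101.8}

print(estimate_source(graph, observations, mean_delay=1.0))
```

In this toy case the four observers report nearly identical times, which only a source equidistant from all of them can explain, so the sketch settles on node "d".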

It seems to work. They applied their modeling technique to the cholera outbreak that occurred in South Africa in 2000 and located the source with an error of less than four hops using data from just twenty percent of the communities involved. Unfortunately, their techniques can only be applied under certain idealized conditions, e.g. when there’s a single source or when there’s only one choice at each node, but that doesn’t take away from what they’ve accomplished, which may well open up a whole new area of network research.
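An error measured in hops is simply the number of links separating the estimated source from the true one. The short Python sketch below shows one way such an error could be computed; the chain of villages and the (true, estimated) pairs are invented for illustration and are not the cholera data.

```python
from collections import deque

def hop_distance(graph, a, b):
    """Number of links on a shortest path from a to b, or None if unreachable."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == b:
                return d + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, d + 1))
    return None

def mean_hop_error(graph, pairs):
    """Average hop distance over (true_source, estimated_source) pairs."""
    errors = [hop_distance(graph, true, est) for true, est in pairs]
    return sum(errors) / len(errors)

# Hypothetical villages strung along a river: v0 - v1 - v2 - v3 - v4 - v5.
villages = {
    "v0": ["v1"], "v1": ["v0", "v2"], "v2": ["v1", "v3"],
    "v3": ["v2", "v4"], "v4": ["v3", "v5"], "v5": ["v4"],
}
trials = [("v0", "v2"), ("v3", "v3"), ("v5", "v4")]

print(mean_hop_error(villages, trials))
```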

More information: Locating the Source of Diffusion in Large-Scale Networks, Phys. Rev. Lett. 109, 068702 (2012). DOI:10.1103/PhysRevLett.109.068702 (ArXiv preprint)

Abstract
How can we localize the source of diffusion in a complex network? Because of the tremendous size of many real networks, such as the internet or the human social graph, it is usually unfeasible to observe the state of all nodes in a network. We show that it is fundamentally possible to estimate the location of the source from measurements collected by sparsely placed observers. We present a strategy that is optimal for arbitrary trees, achieving maximum probability of correct localization. We describe efficient implementations with complexity O(N^α), where α = 1 for arbitrary trees and α = 3 for arbitrary graphs. In the context of several case studies, we determine how localization accuracy is affected by various system parameters, including the structure of the network, the density of observers, and the number of observed cascades.
