Facebook (and Systems Biologists) Take Note: Network Analysis Reveals True Connections

Dec 07, 2009 By Megan Fellman
A new universal method of network analysis can predict friendships in a social network and protein-protein interactions within a cell. The method also can separate interactions likely to be spurious from those likely to be correct. In the illustration, each square represents a slightly different reconstruction of a protein interaction network. The background color indicates the reliability of the reconstruction, with the darkest color indicating the best representation of the original network.

(PhysOrg.com) -- Facebook figures out that you know Holly, although you haven't seen her in 10 years, because you have four mutual friends -- a good predictor of direct friendship. But sometimes Facebook gets it wrong. "Hey, I don't know Harry!"

Roger Guimera and Marta Sales-Pardo, a husband-wife research team at Northwestern University, have developed a universal method that can accurately analyze a range of complex networks -- including social networks, protein-protein interactions and air transportation networks. Although the datasets they used were much smaller than Facebook's, the researchers demonstrated the great potential of their method.

Guimera and Sales-Pardo had wondered if one technique, exploiting the fact that all networks have groups in them and those groups are connected in many different ways, could be used to predict both friendships in a social and protein-protein interactions within a cell. They applied their mathematical and computational framework to five different networks, ranging from a group of dolphins to a network of neurons, and found one method indeed could reliably analyze all.

The details of their algorithm, which can predict missing and spurious interactions in a system, will be published in the Dec. 7 Early Edition by the (PNAS).

"The way the flu spreads, for example, is based on an underlying network, and it's important to understand the critical patterns," said Guimera, a research assistant professor of chemical and biological engineering in the McCormick School of Engineering and Applied Science. "Using available data, our method tries to find the best description of the network being analyzed, no matter what kind of network."

In the study, Guimera and Sales-Pardo tested their method on a range of five known "true" networks: a karate club, a social network of dolphins, the neural network of the worm C. elegans, the air transportation network in Eastern Europe and the metabolic network of E. coli. These networks have between 34 nodes (members of a karate club) and 604 nodes (metabolites in a metabolic network).

"Our method separates wheat from chaff, the signal from the noise," said Sales-Pardo, also a research assistant professor of chemical and biological engineering. "There are many ways to map nodes in a network, not just one. We consider all the possible ways. By taking the sum of them all, we can identify both missing and spurious connections."

A more accurate method of network analysis could help , for example, identify truly relevant connections -- with 350 million Facebook users the number of mistakes can add up quickly. Systems biology could benefit, too. The project to obtain a complete map of the millions of human protein-protein interactions has a projected cost of $1 billion but relies on techniques with accuracies (estimated in 2002) to be below 20 percent.

The central idea behind Guimera and Sales-Pardo's method is that, even though each network has unique characteristics (depending on its functional needs and evolutionary history), all networks share a remarkable property: their nodes can be classified into groups with the nodes connecting to each other depending on their group membership. In a social network, for example, people can be grouped by age, occupation, political orientation and so on. The method proceeds by averaging all possible groupings of the nodes, giving each grouping a weight that reflects its explanatory power.

For each of the five true networks, the researchers introduced errors and applied their algorithm to the distorted network. Each time, the algorithm produced a new network that reliably separated interactions likely to be spurious from those likely to be correct, without the aid of any additional information (such as the type of network or the amount of errors). Each new network reconstruction was closer to the original true network than the network containing errors and omissions.

"The flexibility of our approach, along with its generality and its performance, will make it applicable to many areas where network data reliability is a source of concern," the authors wrote.

The PNAS paper is titled "Missing and Spurious Interactions and the Reconstruction of Complex Networks."

Source: Northwestern University (news : web)

Explore further: New algorithm identifies data subsets that will yield the most reliable predictions

add to favorites email to friend print save as pdf

Related Stories

Using networks to map the social lives of animals

Sep 02, 2008

Dr Dick James from the University's Department of Physics has released a practical guide for biologists explaining how social network analysis, a method used widely in the social sciences to study interactions among people, ...

What's the semantic organization of human language?

Aug 11, 2009

A Chinese semantic network with semantic (argument structure) annotation was built and investigated for finding its global statistical properties. The results show that semantic network is also small-world and scale-free ...

New evidence on the robustness of metabolic networks

Sep 04, 2008

Biological systems are constantly evolving in ways that increase their fitness for survival amidst environmental fluctuations and internal errors. Now, in a study of cell metabolism, a Northwestern University research team ...

Recommended for you

Designing exascale computers

Jul 23, 2014

"Imagine a heart surgeon operating to repair a blocked coronary artery. Someday soon, the surgeon might run a detailed computer simulation of blood flowing through the patient's arteries, showing how millions ...

User comments : 8

Adjust slider to filter visible comments by rank

Display comments: newest first

rjm1percent
5 / 5 (1) Dec 07, 2009
For each of the five true networks, the researchers introduced errors and applied their algorithm to the distorted network. Each time, the algorithm produced a new network that reliably separated interactions likely to be spurious from those likely to be correct


On what basis does it do this? Sounds to me when introduced with many thousands of intertwining networks, this system wouldn't hold much water, as the various conformations could number in the tens of thousands. Re-publish this article when the model has undergone some substantial field trials. Too fortuitous for my liking.
fuzz54
not rated yet Dec 07, 2009
Why should they republish the article when it already has relevance to smaller order systems? You have to start somewhere. And nowhere in this article are intertwining networks mentioned. They stuck to single networks with multiple grouping characteristics in a network.
rjm1percent
not rated yet Dec 07, 2009
Well, the article is entitled: "Facebook (and Systems Biologists) Take Note"

Therefore implying a correlation between this study and its possible application in facebook updates and protein-protein interactions. But the article has little relevance because of its specificity and limited use. Wouldn't be surprised if they were analysing networks with single digit instances
chrismac
not rated yet Dec 08, 2009
In other news, scientists have discovered a drug that could potentially cure cancer, but because the initial step didn't fully cure every cancer, trials have been dropped.
rjm1percent
not rated yet Dec 08, 2009
@chrismac, link please?
rguimera
not rated yet Dec 08, 2009
rjm1percert, thanks for your comments. Maybe you should read the article (I think you may actually enjoy it) before asking us to write a new one :) You are right--the number of possible groupings of the nodes is, to all practical purposes, infinite. But the same is true for the number of microscopic configurations of the molecules of a gas, and we can still calculate thermodynamic properties using statistical mechanics. This example is more than a metaphor, the way we estimate the reliability of each link in a network is mathematically identical to how one estimates the thermodynamic properties of a gas.
rjm1percent
not rated yet Dec 08, 2009
Hi Roger,
I just read the article for the first time and you're completely correct, I thoroughly enjoyed it! But I had to stop at about the halfway point, because I got this strange itch on my neck. I'm still uncertain what the itch was, which is slightly disconcerting as it was one of those dry-itches which tend to scab quite easily. I put some cream on it and I hope it will heal up, but to what extent I am not too sure.

Could you offer any advice on this?
Thanks
rguimera
not rated yet Dec 08, 2009
I do have some advice. Wait until the paper is actually published (it has not appeared in PNAS yet) and then REALLY read it--your itch might go away.