Model describes Web page popularity

October 20, 2010 By Lisa Zyga, feature

How do some Web pages become popular? In a recent study, researchers analyzed Wikipedia articles and a collection of all the Web pages of Chile to better understand the dynamics of online popularity. They observed that online popularity is characterized not by a gradual accumulation process, but by "bursts" that display many of the same features as critical systems, such as stock market crashes and natural phenomena. They also developed a model that captures these critical features of online popularity.

“We see that Internet behaves in unpredictable ways, with big shifts in attention causing changes which have statistical signatures like those seen in earthquakes and avalanches,” said Jacob Ratkiewicz of Indiana University.

Ratkiewicz and his coauthors from Indiana University and the Institute for Scientific Interchange in Torino, Italy, have published their study on online popularity in a recent issue of Physical Review Letters. As they explain, online information that becomes popular has formidable power to influence opinions, culture, and policy, as well as to earn higher advertising profits. Online popularity is highly sought after for these reasons, but as previous studies have found, very few sites become tremendously popular.

In the researchers' analysis, the popularity of a Wikipedia article or Web page is expressed by the number of clicks to that page and the number of external links to that page. While previous studies have found that the popularity distribution follows power-law behavior, it has been difficult to observe the growth in popularity of individual pages due to the lack of data with temporal information. Here, the researchers gathered the traffic data of millions of pages (3 million Wikipedia articles with a one-second time resolution during 2001-2007; 3 million Wikipedia articles with a one-hour time resolution during 2008-2010; and 3 million Web pages from Chile's .cl domain with a one-year time resolution during 2002-2006). They obtained the Wikipedia data by mining the full edit history of every article, and the Chilean Web page data by using the country's TODOCL search engine.

Among their results, the researchers found that almost all pages experience a burst of popularity near the beginning of their lives. Then, some pages maintain a constant exponential growth, while many other pages experience intermittent bursts. Looking at these bursts more closely, the researchers found that their distribution follows a “heavy-tail” behavior, which is a common feature of critical systems. In a heavy-tail distribution, most of the items exhibit small values, but a few items exhibit very large values that dominate the overall volume of traffic. As the researchers noted, these bursts are different from those observed in news-driven events, where attention fades rapidly; instead, sequences of bursts occur for certain Web pages and these pages accumulate popularity.
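To make the heavy-tail idea concrete, here is a minimal Python sketch. It is only an illustration, not the researchers' analysis: the function names, the Pareto exponent of 1.2, and the sample size are all assumptions chosen for the example. It draws synthetic "burst sizes" from a heavy-tailed (Pareto) distribution and from a light-tailed (exponential) one, then measures how much of the total comes from the largest 1 percent of values.

```python
import random


def top_share(values, fraction):
    """Return the share of the total contributed by the largest `fraction` of items."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


def sample_heavy_tailed(n, alpha=1.2, seed=0):
    """Draw n synthetic burst sizes from a Pareto distribution (assumed exponent)."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]


def sample_light_tailed(n, seed=0):
    """Draw n values from an exponential distribution for comparison."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) for _ in range(n)]


if __name__ == "__main__":
    heavy = sample_heavy_tailed(20000)
    light = sample_light_tailed(20000)
    print("top 1% share, heavy tail:", round(top_share(heavy, 0.01), 3))
    print("top 1% share, light tail:", round(top_share(light, 0.01), 3))
```

In the heavy-tailed sample, the top 1 percent of values should account for a far larger share of the total than in the exponential sample, which is the sense in which a few large bursts can dominate overall traffic.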

The researchers developed a ranking model that could reproduce some of the features of the popularity burst distribution, but they had to add a “reranking mechanism” to reproduce the heavy tail. The reranking mechanism randomly boosts the popularity value of a Web page, and enables the model to more closely represent the features in the actual data. Although the model is mostly descriptive, its ability to reproduce the dynamics of online popularity could lead to a better understanding of how online information becomes popular.
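The general idea of a rank-based model with random reranking can be sketched in a few lines of Python. This is a toy version written for illustration under stated assumptions, not the model from the paper: the click rule (probability proportional to rank raised to the power of minus `alpha`), the way a page is boosted (moved to a random equal-or-better rank), and every parameter name and default value are hypothetical.

```python
import random
from itertools import accumulate


def simulate(n_pages=200, n_steps=20000, alpha=1.0, rerank_prob=0.0, seed=0):
    """Toy rank-based click model; returns the click count for each page id."""
    rng = random.Random(seed)
    ranking = list(range(n_pages))  # ranking[0] holds the top-ranked page id
    clicks = [0] * n_pages
    positions = list(range(n_pages))
    # Assumed click rule: rank r (1-based) is chosen with weight r ** -alpha.
    cum_weights = list(accumulate((r + 1) ** -alpha for r in range(n_pages)))

    for _ in range(n_steps):
        # One click goes to whichever page currently sits at the chosen rank.
        pos = rng.choices(positions, cum_weights=cum_weights)[0]
        clicks[ranking[pos]] += 1

        # Assumed reranking step: occasionally move a random page
        # to a random rank that is the same as or better than its current one.
        if rng.random() < rerank_prob:
            old = rng.randrange(n_pages)
            new = rng.randrange(old + 1)
            page = ranking.pop(old)
            ranking.insert(new, page)

    return clicks


if __name__ == "__main__":
    static = simulate(rerank_prob=0.0)
    shuffled = simulate(rerank_prob=0.05)
    print("most-clicked page, no reranking:", max(static))
    print("most-clicked page, with reranking:", max(shuffled))
```

With `rerank_prob=0.0` the ranking never changes, so each page collects clicks at a steady rate set by its fixed rank. Raising `rerank_prob` lets a page jump suddenly to a better rank, which produces abrupt changes in its click rate; whether this toy reproduces the heavy tail reported in the study is not something the sketch establishes.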

“We hope that deeper understanding of how popularity evolves could lead to methods for predicting things that will become popular before they actually do,” Ratkiewicz said.

“I'm not sure that this understanding could be used to legitimately improve the popularity of specific Web pages,” he added. “However, recent experience in another project of ours suggests that people are trying to exploit social media to generate bursts of attention toward specific Web sites. It's been shown that these 'twitter-bombs' can catapult a page to the top of Google search results.”

More information: Jacob Ratkiewicz, et al. “Characterizing and Modeling the Dynamics of Online Popularity.” Physical Review Letters 105, 158701 (2010). DOI: 10.1103/PhysRevLett.105.158701

