Model describes Web page popularity

Oct 20, 2010, by Lisa Zyga

How do some Web pages become popular? In a recent study, researchers analyzed Wikipedia articles and a collection of all the Web pages of Chile to better understand the dynamics of online popularity. They observed that online popularity is characterized not by gradual accumulation but by "bursts" that display many of the features of critical systems, as seen in stock market crashes and natural phenomena. They also developed a model that captures these critical features of online popularity.

“We see that Internet behaves in unpredictable ways, with big shifts in attention causing changes which have statistical signatures like those seen in earthquakes and avalanches,” said Jacob Ratkiewicz from Indiana University.

Ratkiewicz and his coauthors from Indiana University and the Institute for Scientific Interchange in Torino, Italy, have published their study on online popularity in a recent issue of Physical Review Letters. As they explain, online information that becomes popular has formidable power to influence opinions, culture, and policy, and it earns higher advertising profits. Online popularity is highly desired for these reasons, but as previous studies have found, very few sites become tremendously popular.

In the researchers' analysis, the popularity of a Wikipedia article or Web page is expressed by the number of clicks to that page and the number of external links to that page. While previous studies have found that the popularity distribution follows power-law behavior, it has been difficult to observe the growth in popularity of individual pages due to the lack of data with temporal information. Here, the researchers gathered the traffic data of millions of pages (3 million Wikipedia articles with a one-second time resolution during 2001-2007; 3 million Wikipedia articles with a one-hour time resolution during 2008-2010; and 3 million Web pages from Chile's .cl domain with a one-year time resolution during 2002-2006). They obtained the Wikipedia data by mining the full edit history of every article and the Chilean Web page data using the country's TODOCL search engine.

Among their results, the researchers found that almost all pages experience a burst of popularity near the beginning of their lives. Then, some pages maintain a constant exponential growth, while many other pages experience intermittent bursts. Looking at these bursts more closely, the researchers found that their distribution follows a “heavy-tail” behavior, which is a common feature of critical systems. In a heavy-tail distribution, most of the items exhibit small values, but a few items exhibit very large values that dominate the overall volume of traffic. As the researchers noted, these bursts are different from those observed in news-driven events, where attention fades rapidly; instead, sequences of bursts occur for certain Web pages and these pages accumulate popularity.
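To make the "heavy-tail" idea concrete, the short Python sketch below measures how much of a total is carried by the largest few items. It is only an illustration and does not use the study's data: the helper names `top_share` and `sample_bursts` are hypothetical, and the Pareto distribution is an assumed stand-in for a heavy-tailed burst distribution.

```python
import random


def top_share(values, fraction):
    """Return the share of the total carried by the largest `fraction` of items."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))  # always count at least one item
    return sum(ordered[:k]) / sum(ordered)


def sample_bursts(n, alpha, seed=0):
    """Draw n synthetic burst sizes from a Pareto distribution.

    Assumption: Pareto is used here only as a convenient heavy-tailed
    example; the article does not say which distribution fits the data.
    """
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]
```

Calling `top_share(sample_bursts(10000, 1.2), 0.01)` gives the fraction of the total volume held by the top 1% of synthetic bursts. For evenly spread values the same call returns a figure close to the fraction itself, which is the contrast the paragraph above describes.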

The researchers developed a ranking model that could reproduce some of the features of the popularity burst distribution, but they had to add a “reranking mechanism” to reproduce the heavy tail. The reranking mechanism randomly boosts the popularity value of a Web page, and enables the model to more closely represent the features in the actual data. Although the model is mostly descriptive, its ability to reproduce the dynamics of online popularity could lead to a better understanding of how online information becomes popular.
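The following Python sketch shows one way a rank-based growth model with a random reranking step could be simulated. It is not the authors' implementation: the function name `simulate`, the parameters `alpha` and `p_rerank`, the choice of weights proportional to rank raised to the power `-alpha`, and the rule of moving a random page to a random higher position are all assumptions made for illustration.

```python
import random


def simulate(steps, alpha=1.0, p_rerank=0.1, seed=0):
    """Grow a set of pages one at a time (illustrative sketch only).

    Assumptions, none taken from the article: each new page gives one
    link or click to an existing page chosen with probability
    proportional to rank ** -alpha, and with probability p_rerank a
    randomly chosen page is moved to a random higher position.
    """
    rng = random.Random(seed)
    ranking = [0]      # page ids, best-ranked page first
    popularity = [0]   # popularity[page_id] = links or clicks received

    for new_page in range(1, steps + 1):
        # Higher-ranked pages (small rank number) get larger weights.
        weights = [(pos + 1) ** -alpha for pos in range(len(ranking))]
        target = rng.choices(ranking, weights=weights, k=1)[0]
        popularity[target] += 1

        # The new page enters at the bottom of the ranking.
        ranking.append(new_page)
        popularity.append(0)

        # Reranking: occasionally promote a random page.
        if rng.random() < p_rerank:
            old_pos = rng.randrange(len(ranking))
            new_pos = rng.randrange(old_pos + 1)  # same spot or higher
            page = ranking.pop(old_pos)
            ranking.insert(new_pos, page)

    return ranking, popularity
```

A call such as `ranking, popularity = simulate(2000)` returns the final ordering and the popularity count per page. Setting `p_rerank=0` switches the promotion step off, so the two variants can be compared. The loop recomputes all weights at every step, which keeps the code short but makes it suitable only for small runs.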

“We hope that deeper understanding of how popularity evolves could lead to methods for predicting things that will become popular before they actually do,” Ratkiewicz said.

“I'm not sure that this understanding could be used to legitimately improve the popularity of specific Web pages,” he added. “However, recent experience in another project of ours suggests that people are trying to exploit social media to generate bursts of attention toward specific Web sites. It's been shown that these 'twitter-bombs' can catapult a page to the top of Google search results.”

More information: Jacob Ratkiewicz, et al. “Characterizing and Modeling the Dynamics of Online Popularity.” Physical Review Letters 105, 158701 (2010). DOI: 10.1103/PhysRevLett.105.158701
