How to Spot an Influential Paper Based on its Citations

Science journals — By factoring out the effects of timing, scientists may be able to find papers whose large numbers of citations are due to reasons such as high quality content. Credit: Wikimedia.

(PhysOrg.com) -- At first it may seem that the number of citations a published scientific paper receives is directly related to the quality of its content: the higher the quality, the more people read and cite the paper. However, the number of citations a paper receives can depend more on when it was published. Papers published early in a new field receive many more citations than those published later on. Although this effect was already known, a recent study has tested and verified the so-called "first mover advantage" with data from selected fields.

"The common-sense reasoning behind this observation is that papers published early in a field receive citations essentially regardless of content because they are the only game in town," wrote the study's author, physicist Mark Newman of the University of Michigan. "Authors feel the need to cite something and if there is only a small number of relevant publications then inevitably those publications get cited. This gives the earliest publications a head start that is subsequently amplified by the preferential attachment process and they will continue to receive citations indefinitely at a higher rate than later papers because they have more citations to begin with."

As Newman noted, the first person to investigate this issue was the "physicist-turned-historian-of-science" Derek de Solla Price. In 1965, Price presented one of the first studies of networks of citations between papers. He found that, while most papers receive only a small number of citations, a few receive a very large number. Price later proposed an explanation for this pattern, in which a paper attracts new citations at a rate that grows with the number it already has. He called the process "cumulative advantage"; it is now better known as "preferential attachment." Under this process, the papers with lots of citations turn out to be the early ones.

In his paper, Newman calculated the exact size of this first-mover effect (the number of citations as a function of time) within the preferential attachment model. When comparing the results with citation data from a number of fields, he found that the size and duration of the effect often agree closely with the theoretical predictions. In addition, by factoring out the effects of timing, Newman could determine which papers were cited not because they came first, but for other reasons, such as quality of content.
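To make the first-mover effect concrete, here is a minimal simulation sketch of a preferential attachment process. It is not Newman's calculation, which was analytic. It is only a toy model under stated assumptions: each new paper cites a fixed number of earlier papers, and an earlier paper is chosen with probability proportional to its current citation count plus a constant offset (so that uncited papers can still be picked). The function name, parameter values, and the 10% cutoffs are all illustrative choices.

```python
import random


def simulate_citations(n_papers=1000, refs_per_paper=3, offset=1.0, seed=0):
    """Toy preferential attachment model of a citation network.

    Papers are added one at a time. Each new paper cites up to
    `refs_per_paper` distinct earlier papers, chosen with probability
    proportional to (citations so far + offset). Returns a list in which
    entry i is the final citation count of the i-th paper published.
    All parameter values are assumptions made for illustration.
    """
    rng = random.Random(seed)
    citations = [0]  # the first paper has nothing to cite
    for _ in range(1, n_papers):
        weights = [count + offset for count in citations]
        n_refs = min(refs_per_paper, len(citations))
        cited = set()
        # Redraw until we have the required number of distinct targets.
        while len(cited) < n_refs:
            cited.add(rng.choices(range(len(citations)), weights=weights)[0])
        for index in cited:
            citations[index] += 1
        citations.append(0)  # the new paper starts with no citations
    return citations


citations = simulate_citations()
tenth = len(citations) // 10
early_mean = sum(citations[:tenth]) / tenth
late_mean = sum(citations[-tenth:]) / tenth
print(f"mean citations, earliest 10% of papers: {early_mean:.2f}")
print(f"mean citations, latest 10% of papers:   {late_mean:.2f}")
```

In a model like this, every paper has identical "quality," so any gap between the early and late averages comes purely from timing and the rich-get-richer rule.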

"If we measure a paper's citation count relative to the average in its field for the given publication date, then this effect is factored out and - perhaps - the true stars of the citation galaxy will emerge," Newman wrote.
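As a rough illustration of that idea, the sketch below divides each paper's citation count by the average count of papers published in the same year. This is a simplification, not necessarily the exact procedure used in the study: grouping by calendar year, the function name, and the sample numbers are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean


def relative_citation_scores(papers):
    """Score each paper relative to the average for its publication year.

    `papers` is a list of (paper_id, publication_year, citation_count)
    tuples. A score above 1 means the paper is cited more than the
    average paper of the same year. Bucketing by year is an assumption
    chosen for simplicity.
    """
    counts_by_year = defaultdict(list)
    for _, year, count in papers:
        counts_by_year[year].append(count)
    average_by_year = {y: mean(c) for y, c in counts_by_year.items()}

    scores = {}
    for paper_id, year, count in papers:
        average = average_by_year[year]
        # Guard against a year in which no paper has been cited at all.
        scores[paper_id] = count / average if average > 0 else 0.0
    return scores


# Hypothetical data: two early papers and three later ones.
papers = [
    ("A", 2000, 100),
    ("B", 2000, 60),
    ("C", 2005, 12),
    ("D", 2005, 2),
    ("E", 2005, 1),
]
scores = relative_citation_scores(papers)
for paper_id, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(paper_id, round(score, 2))
```

With these made-up numbers, paper C has far fewer raw citations than A or B but the highest relative score, because it is compared only with papers of its own year.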

He concluded that, on the one hand, a cynical observer who wants to be highly cited would be better off writing the first paper in a new field than writing the best paper in a more mature field. On the other hand, understanding this effect could help readers find papers that buck the first-mover trend, as these papers could contain true breakthroughs.

More information: M.E.J. Newman. "The first-mover advantage in scientific publication." 2009 EPL 86 68001 (6pp) doi: 10.1209/0295-5075/86/68001