How science can beat the flawed metric that rules it

Jul 30, 2014 by Nikolaus Kriegeskorte

In order to improve something, we need to be able to measure its quality. This is true in public policy, in commercial industries, and also in science. Like other fields, science has a growing need for quantitative evaluation of its products: scientific studies. However, the dominant metric used for this purpose is widely considered to be flawed. It is the journal impact factor.

The impact factor is a measure of how many times recent papers from a particular scientific journal are cited in other scientific papers. Journals with a high impact factor enjoy prestige. Scientists compete to publish their work there, because this boosts their reputation and funding opportunities. In order to be published in such journals, a paper needs to pass prepublication peer review, a process in which two to four anonymous scientists evaluate its quality.
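The arithmetic behind the metric is simple: a journal's impact factor for year Y is the number of citations received in Y by items published in Y-1 and Y-2, divided by the number of citable items from those two years. A minimal sketch, with hypothetical figures:

```python
def impact_factor(citations_this_year, citable_items_prev_two_years):
    """Citations in year Y to papers from years Y-1 and Y-2,
    divided by the number of citable items from those two years."""
    return citations_this_year / citable_items_prev_two_years

# Hypothetical example: a journal whose 2012-2013 papers drew
# 1200 citations in 2014, from 400 citable items, has a 2014
# impact factor of 3.0.
print(impact_factor(1200, 400))  # 3.0
```

Note that this is a journal-level average: a handful of heavily cited papers can dominate the numerator, which is exactly why the figure says little about any individual paper in the journal.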

The impact factor creates a powerful social reality, in which a paper published in a journal with a higher impact factor is a better paper, and a scientist publishing in such journals is a better scientist.

Because the impact factor is based on the number of times all recent papers in a journal are cited, it is widely understood to provide a poor indication of the quality of each individual paper appearing in that journal. It is not just scientists, but also many journal editors and publishers who object to this metric. We have come to a point where the impact factor is almost universally rejected, and embracing it would pose a bit of a risk to your status in academic circles.

Throwing away a bad map

In the San Francisco Declaration on Research Assessment (DORA), editors, publishers, and scientists recommend against the use of journal-based metrics, such as the impact factor, as indicators of the quality of individual papers. Some of the signatories recently reported on DORA in The Conversation. Scientists such as open-access pioneer Michael Eisen, recent Nobel laureate Randy Schekman, and science blogger Dorothy Bishop have similarly been calling for impact factors to be ignored when the quality of research is assessed.

At the same time, however, almost every scientist relies on the metric (or the prestige it confers to a journal) when selecting what to read. "How do you choose what to read?" is one of the more embarrassing questions to drop on a scientist.

Despite its flaws, scientists will rely on the impact factor as long as they have no better indication of the reliability and importance of new scientific papers. When deciding which of two new papers to read (assuming they are equally relevant, and we don't know the authors), most of us will prefer the one that appeared in the journal with the higher impact factor. Assessments of the overall scientific contribution of a scientist or department, similarly, rarely ignore this metric.

It is unrealistic to suggest that a committee deciding who to hire or fund should replicate the assessment of individual papers already performed by peer review. Such committees are typically under considerable time pressure. If they are to make a good decision, they will need to use all available evidence to estimate the quality of the work. The impact factor is unreliable. However, direct assessment of the applicant's work will similarly be compromised by the limits of the committee's time and expertise.

Given a choice between a bad map and no map at all, a rational person will choose the bad map. Asking people to ignore the only indication of the quality of recent scientific papers we currently have, in favour of "judging by the content", is like saying we should not choose which books to read until we have read them.

How to beat the impact factor

The only way to beat the impact factor is to provide a better evaluation signal for new scientific papers.

When a paper is published, it is read and judged in private by experts in the field who work on related questions. All we need to do to beat the impact factor is sample those expert judgements and combine them into numerical evaluations that reflect peer opinion on the reliability and importance of individual scientific papers. Such a process of open evaluation would provide ratings that are specific to each paper and combine a larger number of expert opinions than traditional peer review can. The process could also benefit from post-publication commentary.

An open evaluation system will need to be more complex than Facebook likes or product ratings on Amazon. We will need at least two rating scales: one for reliability and one for importance. We will also need to enable scientists to sign their ratings with digital authentication. Signed judgements will be essential to ensure that the system is trustworthy and transparent. An average of even just a dozen signed ratings by renowned experts would almost certainly provide a better evaluation signal and could free us from our dependence on the impact factor.
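The aggregation described above can be sketched in a few lines: each signed rating scores a paper on the two scales, and the paper-level signal is the mean on each scale. This is a minimal illustration of the idea, not an existing system; the class and field names are assumptions.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class SignedRating:
    reviewer: str        # authenticated identity of the rater
    reliability: float   # e.g. on a 0-10 scale
    importance: float    # e.g. on a 0-10 scale

def evaluate(ratings):
    """Combine signed ratings into per-paper scores on each scale."""
    return {
        "reliability": mean(r.reliability for r in ratings),
        "importance": mean(r.importance for r in ratings),
        "n_ratings": len(ratings),
    }

# Hypothetical ratings by three named experts for one paper:
ratings = [
    SignedRating("expert_a", 8.0, 6.0),
    SignedRating("expert_b", 9.0, 7.0),
    SignedRating("expert_c", 7.0, 8.0),
]
print(evaluate(ratings))
# {'reliability': 8.0, 'importance': 7.0, 'n_ratings': 3}
```

A real system would weight or filter ratings (e.g. by rater expertise) rather than take a plain mean, but even this simplest aggregate is paper-specific, unlike a journal-level impact factor.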

Time for change

Scientific publishing is currently in a state of flux. Recent developments point in the right direction, although they do not go far enough. These include PubMed Commons, PLoS Open Evaluation, Altmetric, and a number of new start-ups such as PubPeer, ScienceOpen, and The Winnower. Eventually, we might want to consolidate the open evaluation process into a single system, which should ideally be publicly funded and entirely transparent.

The evaluation of scientific papers steers the direction of each field of science, and – beyond science – guides real-world applications and public policy. If papers had reliable ratings, science would progress with a surer step. Only findings found to be reliable and important by a broad peer evaluation process would be widely publicised, thus improving the impact of science on society.

The perceived importance of a scientific paper should reflect the deepest wisdom of the scientific community, rather than the judgements of three anonymous peer reviewers. It is time scientists took charge of the evaluation process. Open evaluation will mean a fundamental change of the culture of science toward openness, transparency, and constructive criticism. We are slowly realising that the rules of the game are ultimately up to us, and taking on the creative challenge to change them.





User comments : 19


1 / 5 (4) Jul 30, 2014
[W]e might want to consolidate the open evaluation process into a single system, which should ideally be publicly funded and entirely transparent.
LOL "publicly funded"? Means like the internet was built? Now being taken over by the NWO-OWG using tax-funding taken at the point of a gun.

Individual grassroots efforts are inefficient and anti-fragile against corporatist and government interference.
1 / 5 (3) Jul 30, 2014
Another observation; here we have da' convo blogging about 'impact' - what incredible recherché self-reference as it is "provided" to a news accumulator. See the meme; buzz, twitter, convo? Andy Warhol was prescient: his minutes have been inflated to seconds and microseconds, and his program notes into 140 ASCII characters.
4.2 / 5 (5) Jul 30, 2014
I agree that having only a few peer reviewers is not optimal, but you have to consider some very hard realities:
- papers contain new research. No one else in the world has done that research. The number of people who work in a field closely related enough to judge it (and whose opinion/judgement would MEAN anything) is very small. The chance of irrelevant reviews would be large
- time constraints. If you wait for hundreds of people to read it/comment on it before you decide whether to read it yourself, then you'll be here all decade.
- I can't see researchers having time to comment on every article they read
- Journal papers are peer reviewed anonymously for a good reason. Open review is not anonymous. The problems with this approach should be obvious.

In the end the proposed system still leads to scientists reading journal papers over others - albeit with a, possibly, better grading system that might help when filling posts (or it might not: with 'ballot stuffing').
2.3 / 5 (3) Jul 30, 2014
The impact factor was originally designed for librarians, to help them choose which journals to BUY. The moment scientists adopted it as a criterion for PUBLISHING or even CITATIONS, they turned its original meaning on its head. The positive feedback in applying the criterion leads to gradual instability and divergence.
3.7 / 5 (3) Jul 30, 2014
The more important thing is that access and control has been, and is being, sewn up.

The ownership of scientific journals and the access to the articles therein, is controlled by a few companies.

The question is, is this deliberate and well considered.... is using the story of it being a purely capitalistic concern -- is that a cover story for something entirely deeper?

Basically, the "public" (person in the street) is denied access to all these amazing research articles, articles that are hidden away from the very people who are the ones who bring about the REAL changes in the system, via left field discoveries and work. Work coming from those who might simply find/seek the journal reports - and think of something entirely new.

Discovery and human invention, creativity, blocked via heinous controls and paywalls. (is it merely the cover story? -investigate ownership...)

We cannot have insular science, built like secretive church dogma, and call it an open functional world.
1 / 5 (1) Jul 30, 2014
Paywalls are the commodification/commoditization of the commentariat.
not rated yet Jul 30, 2014

Is doing something along these lines but only with the health sciences.
4.6 / 5 (7) Jul 30, 2014
Basically, the "public" (person in the street) is denied access to all these amazing research articles

You are 'denied access' to scientific articles as much as you are denied access to a bag of potato chips.
Go buy the articles if you want to. You can buy them individually from the publishers at their websites. The 'blocking' isn't done by science but by those who make money off of publishing journals (in the exact same way that other companies make money off of publishing newspapers, magazines, comic books, etc.). So go ask them why they aren't altruistic in a capitalist system.

If you want the articles for free go ask the authors (nicely). Or just move one friggin finger and go to arxiv - where you'll find most anything you want to read about even before it hits journals.

You're really griping about a non-problem.
5 / 5 (1) Jul 30, 2014
'Metrics' are here to stay.
The world is large and complex enough that inevitably important decisions must be made by people with no direct understanding of a situation and no vested interest in the outcomes of their decisions. Metrics are created for these people, because no matter how ignorant they are with respect to the subject matter, they understand basic concepts of arithmetic.
Metrics give them numbers with which they can distinguish, rank, and sort a multitude of objects they must deal with in their ignorance.
Real science builds on its own past successes, while science 'failures' are doomed by future experiments that do not support them.
Empirical validation/falsification is the REAL metric that rules science.
5 / 5 (3) Jul 30, 2014
The metric is used for other things (not among scientists to see who's got the biggest science-tonker. Go ask any scientist what their impact factor is. Unless they are actively trying to get tenure at this very moment, the answer will be "I have no idea". Guaranteed.)

Such a metric is used by committees to fill empty posts. How else are they going to decide whether someone is a capable researcher (and, most importantly, capable of communicating their research) than by using some metric that reflects past work in some way? Such committees are expected to be impartial - so they have to justify their decision afterwards as objective (in front of the other applicants and any patrons of the institution).
Obviously the position is vacant - so there's no in-house expert to judge solely based on the research.

...and research alone not a good professor does make!
1.6 / 5 (7) Jul 30, 2014
go to arxiv - where you'll find most anything you want to read about even before it hits journals
Negative, the arxiv is filled mostly with dumb theoretical articles; the practically important research is still paywalled. And the accessibility of preprints is not the issue here.
1 / 5 (1) Jul 30, 2014
Scientists don't read articles--they inverse-pyramid them. They scan thousands of titles, then a subset of hundreds of abstracts, of which a further subset have their introductions and conclusions read. Only a few are read in whole. Right at the bottom of the inverse pyramid, authors are contacted and occasionally turn into collaborators.
1 / 5 (2) Jul 31, 2014
Scientists don't read articles--they inverse pyramid them
This is not relevant to the impact of journals (and the discussion subject) anyway - only the citations are. It doesn't matter how often and how deeply the articles are read - only how often they are cited. Regarding my previous comment: for example, of over 4,000 articles about cold fusion, only one or two appeared at ArXiv: this subject is not covered by reviewed journals, therefore it essentially doesn't exist for preprint servers, despite being mostly experimental. By comparison, ad-hoc string theory, disproved by experiments, has thousands of articles at ArXiv. That speaks to the actual relevance of ArXiv to physics, which is an experimental science.
1 / 5 (2) Jul 31, 2014
The only way to beat the impact factor is to provide a better evaluation signal for new scientific papers
Better ≠ faster. The problem is, the citation index is the main relevant signal of the quality of scientific articles, despite having its own problems. But citations may be delayed by many years - and scientists need evaluation of their work as soon as possible when asking for grants and subsidies for further research. For this reason the quality proxy in the form of journal impact is maintained: effectiveness of scientific work in attracting citations was replaced with effectiveness in publishing in high-impact journals, with all the positive and negative consequences of that.
1 / 5 (2) Jul 31, 2014
almost every scientist relies on the metric (or the prestige it confers to a journal) when selecting what to read
This is indeed a completely flawed approach. For example, if you're an Arduino developer and you just need some help with your project, you search desperately for all possible relevant sources on the web. When I want to find relevant information about cold fusion or gravitational beams, I don't give a sh*t about where that information was presented. Every source of information is important to me, and I want to judge its relevance myself. Scientists must learn to work like Arduino developers seeking help with their private research projects. The problem is, the character of research in contemporary science is arranged so that they're not forced to do this, as they can survive quite easily on incremental shallow work that brings nothing substantial to the subject of their research. They just work like workers in a huge factory.
1 / 5 (2) Jul 31, 2014
Intellectual laziness in processing information manifests itself in the replacement of independent analysis with blind reliance on impact indices and authorities, even here at physorg. It is just the proponents of mainstream science who depend most extensively on the voting systems here, and who are lazy enough to judge posts by the karma of their authors instead of the actual content. Aren't the scientists and proponents of the scientific method exactly those who should excel at an independent analytical approach? Aren't the proponents of science exactly those who should ignore the voting system here to the largest extent?
1 / 5 (1) Jul 31, 2014
The dependence of scientists on external authorities that decide what is good for them to read and what is not has its roots in the cost of private publishing and the cost of access to information. For example, if all articles were freely available at arxiv in a single pile with an effective search and indexing system, then nobody would care how accessible the journal is where the information was presented. But because scientific journals are so expensive today, everyone must select journals very carefully by quality and relevance, which enforces a system based on blind belief in formal authority instead of actual content. Therefore Open Access journals, or better yet a fully open repository, are the way to remove the unhealthy dependence of scientists on formal indicators of the quality of information. After that we can discuss how to maintain feedback inside such a free system.
1 / 5 (1) Jul 31, 2014
The public voting system apparently doesn't work even here at PhysOrg. For this reason I proposed to replace it with private black- and white-lists of posters (sort of private spam filters, like the ones used for filtering emails and web content), which could otherwise be shared on demand. IMO a public repository of scientific articles could work on the same principles. It would inhibit the influence of voting spammers, but still provide fully democratic feedback.
5 / 5 (2) Jul 31, 2014
the practically important research is still paywalled

Then pay the paywall. What's your problem? If you want something, you pay for it. That's how it goes - always. And since science isn't solely funded by taxpayer money, you don't even have that argument to fall back on.

Regarding my previous comment: for example from over 4.000 articles about cold fusion only one or two appeared at ArXiv:

Why would you expect snake-oil blurbs to be featured in a science archive? That's like saying: In all of National Geographic history there are only two articles on porn stars. Low, but not surprising.

Aren't just the proponents of science these, who should ignore the voting system here in largest extent?

They not only should: they do. No scientist will look up the impact rating of another scientist before conversing. None.

You should stop fantasizing and actually look at reality, or start talking to scientists. Your fantasy world doesn't exist.