Simple errors limit scientific scrutiny

November 11, 2015, Australian National University
Prof Loeske Kruuk. Credit: Stuart Hay, ANU

Researchers have found more than half of the public datasets provided with scientific papers are incomplete, which prevents reproducibility tests and follow-up studies.

However, slight improvements to research practices could make a big difference.

Lead researcher Dr Dominique Roche from the ANU Research School of Biology said many peer-reviewed biological journals now require authors to publicly archive their data when a paper is published.

"Unfortunately, our study suggests that many public datasets may be unusable," Dr Roche said.

Making research data available improves the transparency and reproducibility of research results and avoids unnecessary duplication of data collection.

A survey of 100 papers published in leading journals in ecology and evolution found that more than 50 per cent of the datasets associated with these studies were incomplete, either because data were missing or because they lacked essential information needed to interpret the data.

Dr Roche said that making the data public is extremely useful, but that the process is often compromised by simple errors made by researchers.

"Many scientists, including myself, lack proper training in public data archiving and open science practices. These are new practices for most researchers," he said.

"Biologists often deal with large and complex datasets that require good organisational skills to present in ways that others can use them. The archived datasets can be just as important as the published paper.

"Fortunately, many of the problems we encountered in our study can be fixed relatively quickly and easily."

The study, published in PLOS Biology, makes a number of suggestions, such as providing basic but complete data descriptors, using standard file formats such as comma-separated values (CSV) rather than PDFs or Excel files, and archiving datasets in an established, searchable online database instead of as an appendix to the research paper.
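As a rough illustration of two of these suggestions (a plain CSV file accompanied by a basic data descriptor), the Python sketch below may help. It is not taken from the study: the column names, the values, the "NA" missing-value code and the helper functions to_csv and to_descriptor are all hypothetical, chosen only to show the general idea.

```python
import csv
import io

# Hypothetical example data: column names, meanings, units and values are
# invented for illustration and do not come from the study.
COLUMNS = [
    ("individual_id", "Unique identifier for each animal", "none"),
    ("body_mass", "Body mass at capture", "grams"),
    ("capture_date", "Date of capture (ISO 8601, YYYY-MM-DD)", "none"),
]
ROWS = [
    {"individual_id": "A01", "body_mass": 23.4, "capture_date": "2014-03-02"},
    {"individual_id": "A02", "body_mass": None, "capture_date": "2014-03-05"},
]


def to_csv(rows, columns, missing="NA"):
    """Return the rows as CSV text, writing an explicit code for missing values."""
    names = [name for name, _, _ in columns]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Replace absent or None values with the declared missing-value code.
        writer.writerow({n: missing if row.get(n) is None else row[n] for n in names})
    return buffer.getvalue()


def to_descriptor(columns, missing="NA"):
    """Return a plain-text descriptor listing each column, its meaning and units."""
    lines = [f"Missing values are coded as {missing}.", ""]
    lines += [f"{name}: {meaning} (units: {units})" for name, meaning, units in columns]
    return "\n".join(lines) + "\n"


csv_text = to_csv(ROWS, COLUMNS)
readme_text = to_descriptor(COLUMNS)
```

Keeping the descriptor next to the data file means a reader can tell what each column holds, and how gaps are marked, without contacting the authors.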

Co-author Professor Loeske Kruuk from the ANU Research School of Biology said the paper recommended rewarding researchers who work transparently and collaboratively.

"Journals and databases don't have the resources to check whether archived datasets are adequate," she said.

"The quality of the archived datasets relies on researchers' goodwill."


More information: Dominique G. Roche et al. Public Data Archiving in Ecology and Evolution: How Well Are We Doing?, PLOS Biology (2015). DOI: 10.1371/journal.pbio.1002295


1 comment


Nov 11, 2015
Quick and easy: you just need to take a course, spending no more than a month or two of your time, so that other people, competing with you for grants, tenure, and jobs, can use your data. Just like doing your own check-out and bagging when shopping - no added load on you, right? It's a cost reduction, not a cost redistribution, right? You are PAID for the couple of minutes you spend operating the register, right? Or perhaps you just get a credit discount on your bill? No? Wow, shocker! So, first data should be available, then data must be online, now this data has to be presented so that others can interpret it without referring to the author(s)? Sad. Who is going to police this system? Who is going to watch the watchers?
