Audit finds biodiversity data aggregators 'lose and confuse' data

April 23, 2018, Pensoft Publishers
A snippet of the results from a data processing event. Credit: Dr. Robert Mesibov

In an effort to improve the quality of biodiversity records, the Atlas of Living Australia (ALA) and the Global Biodiversity Information Facility (GBIF) use automated data processing to check individual data items. The records are provided to the ALA and GBIF by museums, herbaria and other biodiversity data sources.

However, an independent analysis of such records reports that ALA and GBIF data processing also leads to data loss and unjustified changes in scientific names.

The study was carried out by Dr Robert Mesibov, an Australian millipede specialist who also works as a data auditor. Dr Mesibov checked around 800,000 records retrieved from the Australian Museum, Museums Victoria and the New Zealand Arthropod Collection. His results are published in the open access journal ZooKeys, and also archived in a public data repository.

"I was mainly interested in changes made by the aggregators to the genus and species names in the records," said Dr Mesibov.

"I found that names in up to 1 in 5 records were changed, often because the aggregator couldn't find the name in the look-up table it used."

Another worrying result concerned type specimens, the reference specimens upon which scientific names are based. On a number of occasions, the aggregators were found to have replaced the name of a type specimen with a name tied to an entirely different type specimen.

The biggest surprise, according to Dr Mesibov, was the major disagreement on names between aggregators.

"There was very little agreement," he explained. "One aggregator would change a name and the other wouldn't, or would change it in a different way."

Furthermore, dates, names and locality information were sometimes lost from records, mainly due to programming errors in the software used by aggregators to check data items. In some data fields the loss reached 100%, with no original data items surviving the processing.

"The lesson from this audit is that biodiversity data aggregation isn't harmless," said Dr Mesibov. "It can lose and confuse perfectly good data."

"Users of aggregated data should always download both original and processed data items, and should check for data loss or modification, and for replacement of names," he concluded.
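The kind of check Dr Mesibov recommends can be sketched in a few lines of Python. This is only an illustration, not the method used in the audit: the field names `verbatim_name` and `processed_name` and the sample records are hypothetical, and a real download from ALA or GBIF will use its own column names for original and processed values.

```python
from collections import Counter


def audit_field(records, original_key, processed_key):
    """Tally what happened to one data field during processing.

    Both key names are placeholders chosen by the caller; they are
    assumptions, not the aggregators' actual column names.
    """
    counts = Counter()
    for rec in records:
        original = (rec.get(original_key) or "").strip()
        processed = (rec.get(processed_key) or "").strip()
        if not original:
            counts["no_original"] += 1   # nothing was supplied to begin with
        elif not processed:
            counts["lost"] += 1          # original value vanished in processing
        elif processed != original:
            counts["changed"] += 1       # value was modified or replaced
        else:
            counts["unchanged"] += 1
    return counts


# Made-up records for illustration only.
sample = [
    {"verbatim_name": "Aus bus", "processed_name": "Aus bus"},
    {"verbatim_name": "Aus cus", "processed_name": "Aus"},
    {"verbatim_name": "Aus dus", "processed_name": ""},
]
result = audit_field(sample, "verbatim_name", "processed_name")
print(dict(result))
```

Running the same tally over each field of interest (names, dates, localities) gives a quick picture of where processing dropped or altered values before the data are used.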


More information: Robert Mesibov, An audit of some processing effects in aggregated occurrence records, ZooKeys (2018). DOI: 10.3897/zookeys.751.24791
