October 6, 2015

Data integration or die: The importance of biologist input in efficiently sharing data

by The Genome Analysis Centre

Vicky Schneider, of the 361° Division at The Genome Analysis Centre (TGAC), has worked with UK and European partners to review key aspects of the standards and formats of biological data, highlighting the importance of data integration and management tools for biologists.

Structural standards for data formats are critical to the value of analyses: they affect retrieval, sharing, validation and reproducibility, and above all integration and interpretation.

Integrating data is imperative for the advancement of research; blending results of diverse disciplines is often an essential step in answering meaningful biological questions. To achieve this, standards should be implemented at the source of the data for the sake of efficiency, particularly since the datasets are constantly increasing in size, and it may be almost impossible to achieve unification further downstream.

To engage the biologist community, the paper aims to familiarise experimental biologists with the definitions and terms used by computational biologists, fostering cooperation towards cohesive data flow pipelines. It identifies four main classes of data format (tables, FASTA, GenBank and tag-structured), a major step in defining how the multitude of formats might be curated.

Data integration in biological research centres on the adoption of standards, which promises easier conversion between data and file formats. The scale and infrastructure of a given database determine whether it should be stored in a centralised or a distributed manner; the trade-off is that centralised storage is harder to update, while distributed storage is harder to query. Either way, when the data need to be integrated with other data, the computational burden of unifying formats should be eased wherever possible.
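As a rough illustration of what converting between two of these format classes can involve, the sketch below reads FASTA text and rewrites it as a comma-separated table. It is not taken from the paper: the function names, the column layout and the example records are assumptions made for this illustration, and real pipelines would normally rely on an established parsing library.

```python
import csv
import io


def parse_fasta(text):
    """Yield (identifier, description, sequence) tuples from FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield _record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield _record(header, chunks)


def _record(header, chunks):
    # The identifier is the text up to the first space; the rest is free text.
    identifier, _, description = header.partition(" ")
    return identifier, description, "".join(chunks)


def fasta_to_table(text):
    """Return the FASTA records as CSV text (hypothetical column layout)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "description", "length", "sequence"])
    for identifier, description, sequence in parse_fasta(text):
        writer.writerow([identifier, description, len(sequence), sequence])
    return out.getvalue()


# Made-up records, used only to exercise the functions above.
EXAMPLE_FASTA = """>seq1 hypothetical example record
ACGTACGT
ACGT
>seq2
GGCC
"""

if __name__ == "__main__":
    print(fasta_to_table(EXAMPLE_FASTA), end="")
```

Even in this small case, choices such as how to split the header line or which columns to emit have to be agreed on by whoever produces and consumes the data, which is the kind of decision the article argues biologists should help to standardise.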

Ideally, biologists should work with bioinformaticians and computer scientists and become more involved in standardising their data structures, reducing the ongoing burden of database management and of writing tools to parse data. This would boost biological research by giving data analysis a more robust structure.

Senior Author, Dr Vicky Schneider, Head of the 361° Division at TGAC, said: "Data integration should not just rely on software engineers and computational scientists, but needs to be driven by the actual users whose communities need to define, adopt and use standards, ontologies and annotation best practice. Therefore, it is particularly important for the biological research community to get acquainted with the conceptual basis of data integration, its limitations, challenges and terminology."

Senior Author, Dr Allegra Via, Assistant Professor in the Biocomputing Group of Sapienza, University of Rome, added: "The importance of biologists in data integration is huge. They are those who produce and analyse data, which need to be shared for a better science. There cannot be data sharing without good practice in data integration."

The paper, titled "Data Integration in Biological Research: An overview", is available via PubMed. The publication is a collaborative effort between TGAC, the Department of Informatics at Ionian University, the ELIXIR Hub and the Biocomputing Group at Sapienza University.

Provided by The Genome Analysis Centre

Citation: Data integration or die: The importance of biologist input in efficiently sharing data (2015, October 6) retrieved 2 July 2024 from https://phys.org/news/2015-10-die-importance-biologist-efficiently.html

