January 16, 2020

Banking on a new community isotope database

by Aaron Dubrow, University of Texas at Austin

Stable isotopes act like fingerprints or fibers in forensics, capturing details of where someone or something lived, what it ate or breathed, and how its environment changed over time.

Isotopes are variants of an element whose nuclei contain the same number of protons but different numbers of neutrons. Some isotopes are unstable and short-lived, but there are approximately 300 known naturally occurring stable isotopes, including isotopes of common elements like hydrogen, carbon, nitrogen, and oxygen, that possess different numbers of neutrons based on how they formed.

Stable isotope analysis is used in a large number of fields, from archeology, where it is used to date objects, to conservation biology, where it helps researchers understand why a species is struggling.

"It's a useful interdisciplinary tool that researchers can apply to just about anything," said Seth Newsome, an animal ecologist at the University of New Mexico who uses isotopes to study the resources that sustain mammals, birds, and fish. "Tell me a problem and I'll find a way to use isotopes to solve it."

Tens of thousands of researchers like Newsome use isotope analysis in their work, and frequently the same dataset can be useful to many researchers, even those working in disparate fields. However, until recently, no central data repository existed.

"Before, the process was largely independent," said Jonathan Pauli, associate professor of forestry and wildlife ecology at the University of Wisconsin-Madison. "Researchers would generate data to use in papers and then store it on laptops or servers, siloed away. There's two problems with that: First, it's dangerous—decades of data are possibly at risk. And second, the data is unavailable. Data should be available to others to be able to build on. Progress is at the heart of science."

From Discussion to Action

Starting in 2015, a handful of researchers set about trying to change this. They recognized that research was limited by a lack of access to, or knowledge of, existing data. They also saw how the creation of GenBank in the 1980s had accelerated the rate of discovery in genomics and become one of the most valuable resources in science—deemed so critical that it is now funded directly by Congress.

"When people talk about the impact that technology has had on bioinformatics, they talk about genetic sequencing getting cheaper, but without the data becoming available through GenBank, the cost wouldn't make much difference," said Chris Jordan, manager for Data Management and Collections at the Texas Advanced Computing Center (TACC). "Unless you can compare your genome to all the other genomes that have been sequenced, you can't make any sense of it. There is the potential here to do something similar with stable isotope data."

Meetings at conferences eventually led to a position paper in BioScience, in which the researchers asserted that "[n]ow is the time to invest in a parallel special-purpose database for another burgeoning field of research with enormous promise: the use of stable isotopes."

In 2017, the National Science Foundation (NSF) funded a workshop that brought together a team that included both researchers who use isotopes and the staff who run the field stations where isotopes are analyzed. The meeting led to an opinion piece in Proceedings of the National Academy of Sciences, titled "Why we need a centralized repository for isotopic data," that outlined a vision of a shared resource for researchers using isotopes.

"Data is being generated at a tremendous rate," said Pauli. "It's time for us to find a place to house all these data in a format that people can draw upon and ask bigger questions."

In 2018, NSF awarded a $1.5 million, three-year grant to the team of researchers and developers based at the University of Wisconsin-Madison, the University of New Mexico, the University of Utah, and TACC to develop a digital archive of data, and ultimately analytical tools, that could be used by researchers worldwide—an IsoBank.

The team held their first Principal Investigators' meeting at TACC in December 2018, spent 2019 consulting with the community through multidisciplinary workshops and developing metadata and data structures from the resulting feedback, and are now working with a group of initial data depositors to bring data into the system. The rich metadata resulting from this community process will provide crucial contextual information to enable rigorous re-use of stable isotope data retrieved from IsoBank.

Building an IsoBank for the Ages

The challenge in creating any large-scale repository is designing it so it can be maximally useful—organized, searchable, sustainable, and easy to access.

Storing the data associated with millions of samples over the long term is hard. Much more challenging, however, is identifying the metadata (the data that describes and gives information about each sample) that must accompany the raw data to make comparisons across time, space, and subject possible.

Metadata helps classify where a sample came from, what it relates to, and how it was analyzed. Understanding what metadata is needed is the first step towards developing high-quality organizational schemas and accessible databases.
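As a rough illustration of what such a record might look like, the sketch below pairs one measured value with the three kinds of metadata described above: where the sample came from, what it relates to, and how it was analyzed. Every field name here is hypothetical and chosen for illustration only; it is not IsoBank's actual schema, which the article does not describe.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SampleRecord:
    """One isotope measurement plus its context (hypothetical fields)."""
    sample_id: str
    # Where the sample came from
    site_name: str
    collected_on: str          # ISO 8601 date string, e.g. "2019-06-01"
    # What it relates to
    taxon: str
    tissue: str
    # How it was analyzed
    instrument: str
    reference_standard: str
    # The measured value itself (per mil)
    delta13c_permil: float


# Text fields that must be non-empty before a record is usable for comparison.
REQUIRED_TEXT_FIELDS = (
    "site_name", "collected_on", "taxon",
    "tissue", "instrument", "reference_standard",
)


def missing_metadata(record: SampleRecord) -> List[str]:
    """Return the names of required metadata fields that are blank."""
    return [
        name for name in REQUIRED_TEXT_FIELDS
        if not getattr(record, name).strip()
    ]
```

A check like `missing_metadata` is one simple way a repository could flag records that lack the context needed for comparison before accepting them.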

"We're isotope experts, but we're not experts in designing, building, and curating such a large database like this," said Pauli. "TACC is really central in providing the expertise for us to be able to build IsoBank the way it should be."

In addition to providing the compute power to support thousands of researchers each year, TACC hosts critical datasets for a range of communities and develops gateways and portals to make the data accessible.

It was TACC's experience developing and supporting the Arctos database of more than three million records from natural and cultural history collections that led Pauli and his collaborators to select TACC as a partner. Other comparable community databases based at TACC are those of the Billie L. Turner Plant Resources Center, DesignSafe (a resource for the natural hazards research community), and SD2E (a web-based analysis platform for the DARPA Synergistic Discovery and Design project).

"TACC's role is to figure out how to express and store all of this metadata; how to enable search in a meaningful way; and how to start adding in useful analysis tools," Jordan said. "Being able to eventually enable new types of research so researchers can ask questions they've never asked before and get useful answers—that's the kind of thing that TACC is here to do."

The long-term nature of TACC's storage and computing infrastructure also assures the sustainability of the project.

"We don't want an ephemeral product. We want to build something that can be used for many years. Being part of the TACC facility will allow us to do so," Pauli explained.

Building a Gateway to Discovery

While Pauli and his colleagues spread the word about IsoBank and recruit researchers from a variety of subfields to determine the metadata requirements, Jordan and the team at TACC are translating the community needs into a virtual framework meant to scale to many petabytes of data and last for decades.

"One of the things I always tell people about TACC is that it's not just big computers, it's about complex computing challenges," said Jordan. "Things that are difficult in a number of dimensions, that's what's going to need our expertise, as well as the ability to integrate interesting analysis and visualization tools that are developed in other contexts."

The goal is to develop a framework for computational analysis where researchers can bring together multiple datasets, including those created to answer very different questions, make comparisons and study changes across long distances or time-scales, and run statistical analyses.
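The kind of cross-dataset comparison described here can be sketched in a few lines, assuming the datasets share a common metadata field to group on. The function name, the `region` key, and the `delta15n_permil` field are all hypothetical, and this is only a minimal illustration of pooling and summarizing, not the analysis tooling the IsoBank team is building.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(datasets, key, value_field="delta15n_permil"):
    """Pool rows from several datasets and average a value per group.

    `datasets` is an iterable of lists of dicts; `key` names the shared
    metadata field used for grouping (for example a region or a decade).
    """
    groups = defaultdict(list)
    for dataset in datasets:
        for row in dataset:
            groups[row[key]].append(row[value_field])
    return {group: mean(values) for group, values in groups.items()}


# Two datasets collected for different studies, sharing a "region" field.
dataset_a = [
    {"region": "north", "delta15n_permil": 10.0},
    {"region": "south", "delta15n_permil": 6.0},
]
dataset_b = [
    {"region": "north", "delta15n_permil": 12.0},
]

summary = mean_by_group([dataset_a, dataset_b], key="region")
```

The grouping only works because both datasets record the same metadata field in the same way, which is why agreeing on metadata comes before building analysis tools.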

For Newsome, whose personal research is in the area of biological conservation, this may mean combining isotope datasets to look at how isotopes of various types of organisms vary over space and time.

"The ability to see what's been done, or use pilot data to help make predictions or design projects—that will help push ecology and biology forward," he said.

Another benefit of central repositories is their ability to generate collaborations both within and between scientific fields.

"There isn't a field out there in natural science and medicine that's not interested in isotopes," Newsome said. "It'll be cool to see this lead to collaborations of truly distant fields, like pharmacology and ecology, and biochemistry and archeology."

With a team of domain specialists determining the requirements and gathering data, and TACC data management experts turning those requirements into useable technologies, the project has the potential to transform science.

"Isotopes are a powerful tool across a diversity of disciplines and will continue to give us important insights into how the world works," Pauli said.

More information: Jonathan N. Pauli et al, Opinion: Why we need a centralized repository for isotopic data, Proceedings of the National Academy of Sciences (2017). DOI: 10.1073/pnas.1701742114

Journal information: BioScience, Proceedings of the National Academy of Sciences

Provided by University of Texas at Austin

Citation: Banking on a new community isotope database (2020, January 16) retrieved 27 June 2024 from https://phys.org/news/2020-01-banking-isotope-database.html

