October 3, 2012

Iowa State researchers developing 'BIGDATA' toolbox to help genome researchers

Today's life scientists are producing genomes galore.

But there's a problem: The latest DNA sequencing instruments are burying researchers in trillions of bytes of data and overwhelming existing tools in biological computing. It doesn't help that there's a variety of sequencing instruments feeding a diverse set of applications.

Iowa State University's Srinivas Aluru is leading a research team that's developing a set of solutions using high performance computing. The team wants to develop core techniques, parallel algorithms and software libraries to help life scientists adapt parallel computing techniques to high-throughput DNA sequencing, the next generation of sequencing technologies.

Those technologies are now ubiquitous, "enabling single investigators with limited budgets to carry out what could only be accomplished by an international network of major sequencing centers just a decade ago," said Aluru, the Ross Martin Mehl and Marylyne Munas Mehl Professor of Computer Engineering at Iowa State.

"Seven years ago we were able to sequence DNA one fragment at a time," he said. "Now researchers can read up to 6 billion DNA sequences in one experiment.

"How do we address these big data issues?"

A three-year, $2 million grant from the BIGDATA program of the National Science Foundation and the National Institutes of Health will support the search for a solution by Aluru and researchers from Iowa State, Stanford University, Virginia Tech and the University of Michigan. In addition to Aluru, the project's leaders at Iowa State are Patrick Schnable, Iowa State's Baker Professor of Agronomy and director of the centers for Plant Genomics and Carbon Capturing Crops, and Jaroslaw Zola, a former research assistant professor in electrical and computer engineering who recently moved to Rutgers University.

The majority of the grant – $1.3 million – will support research at Iowa State. And Aluru is quick to say that none of the grant will support hardware development.

Researchers will start by identifying a large set of building blocks frequently used in genomic studies. They'll develop the parallel algorithms and high performance implementations needed to do the necessary data analysis. And they'll wrap all of those technologies in software libraries that other researchers can use. On top of all that, they'll design a domain-specific language that automatically generates computing code for researchers.

Aluru said that should be much more effective than asking high performance computing specialists to develop parallel approaches to each and every application.

"The goal is to empower the broader community to benefit from clever parallel algorithms, highly tuned implementations and specialized high performance computing hardware, without requiring expertise in any of these," says a summary of the research project.

Aluru said the resulting software libraries will be fully open-sourced. Researchers will be free to use the libraries while developing, editing and modifying them as needed.

"We're hoping this approach can be the most cost-effective and fastest way to gain adoption in the research community," Aluru said. "We want to get everybody up to speed using high performance computing."

Provided by Iowa State University

Citation: Iowa State researchers developing 'BIGDATA' toolbox to help genome researchers (2012, October 3) retrieved 26 April 2024 from https://phys.org/news/2012-10-iowa-state-bigdata-toolbox-genome.html
