DNA barcodes that reliably work: A game-changer for biomedical research

June 21, 2018, University of Texas at Austin
This illustration shows the most common structure of DNA found in a cell, called B-DNA. Credit: Richard Wheeler (Zephyris). Used under the Creative Commons Attribution-ShareAlike 3.0 license.

In the same way that barcodes on your groceries help stores know what's in your cart, DNA barcodes help biologists attach genetic labels to biological molecules to do their own tracking during research, such as following how a cancerous tumor evolves, how organs develop or which drug candidates actually work. Unfortunately, with current methods, many DNA barcodes have a reliability problem much worse than your corner grocer's. They contain errors about 10 percent of the time, which makes data tricky to interpret and limits the kinds of experiments that can be reliably done.

Now researchers at The University of Texas at Austin have developed a new method for correcting the errors that creep into DNA barcodes, yielding far more accurate results and paving the way for more ambitious medical research in the future.

The team—led by postdoctoral researcher John Hawkins, professor Bill Press and assistant professor Ilya Finkelstein—demonstrated that their new method lowers the error rate in barcodes from 10 percent to 0.5 percent, while working extremely rapidly. They describe their method, called FREE (filled/truncated right end edit) barcodes, today in the journal Proceedings of the National Academy of Sciences.

The researchers have applied for a patent and are making the method freely available for academic and noncommercial use.

With DNA barcodes, scientists can study how a cancerous tumor evolves, not just as a whole, but as a large collection of individual cells that evolve differently to reveal which cells are vulnerable to therapeutics and which aren't. Scientists interested in growing replacement organs for injured or sick people can use DNA barcodes to better understand how organs naturally develop. And researchers looking to screen millions of potential drugs to find one that binds to a certain molecule, and thus has the potential to treat a disease, can use DNA barcodes to find the proverbial needle in a haystack.

A key step in the new method for correcting errors in DNA barcodes involves creating "decode spheres" around each potential barcode and then solving a "sphere packing" problem to maximize the number of usable barcodes. Credit: University of Texas at Austin
"DNA barcodes are a part of a great deal of cutting-edge research in medicine and drug development, and to be able to improve the accuracy and efficiency of so many of these is very exciting," said Hawkins. "And maybe even more exciting is that now with these better barcodes, this allows us to have larger, more ambitious experiments that weren't possible before."

A DNA barcode contains a short string of letters that equates to a unique code, using the four letters found in DNA: A, C, G and T. These barcodes are stuck onto molecules, such as cellular proteins or drug candidates, as a way of keeping track of where they all go, sometimes by the millions, and how they interact with other molecules. About one-tenth of the time, however, errors occur (such as one letter being replaced by the wrong letter, an extra letter being inserted, or a letter being deleted), potentially skewing the results of critical biomedical research.
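To make the three kinds of error concrete, here is a minimal Python sketch. The barcode "ACGTAC" and the corrupted reads are hypothetical examples, not sequences from the study, and the distance function is the standard Levenshtein edit distance rather than anything specific to the FREE method. Each single error leaves the read exactly one edit away from the intended barcode.

```python
def levenshtein(a: str, b: str) -> int:
    """Fewest single-letter substitutions, insertions and deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or match at no cost
        prev = curr
    return prev[-1]

intended = "ACGTAC"  # hypothetical barcode
observed = {
    "substitution": "ACCTAC",  # the G was read as a C
    "insertion": "ACGTTAC",    # an extra T was inserted
    "deletion": "ACGAC",       # the T was deleted
}

distances = {kind: levenshtein(intended, read) for kind, read in observed.items()}
print(distances)
```

Insertions and deletions also change the length of the read, which is part of what makes them harder to correct than simple substitutions.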

One of the keys to this new error-correction method is to select just the right barcodes from the beginning. This method involves choosing a string of letters for each barcode such that even if a small error creeps in—say, a G is substituted for a C—it will still be more like the intended barcode than any other. The method requires throwing out many possible strings of letters, but the researchers minimized this loss by borrowing an approach from computer science called sphere packing.
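The selection idea can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' FREE method: it handles substitution errors only (measured by Hamming distance), uses a simple greedy search, and the function names and the barcode length of 5 are assumptions chosen for the example. Candidates are kept only if they differ from every already-kept barcode in at least three positions, so the "spheres" of reads within one error of each barcode never overlap.

```python
from itertools import product

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def pick_barcodes(length: int, min_distance: int) -> list[str]:
    """Greedily keep each candidate that is far enough from all kept barcodes."""
    chosen = []
    for letters in product("ACGT", repeat=length):
        candidate = "".join(letters)
        if all(hamming(candidate, kept) >= min_distance for kept in chosen):
            chosen.append(candidate)
    return chosen

def decode(read: str, barcodes: list[str], max_errors: int):
    """Return the barcode whose sphere contains the read, or None if there is none."""
    for barcode in barcodes:
        if len(read) == len(barcode) and hamming(read, barcode) <= max_errors:
            return barcode
    return None

# Minimum distance 3 means spheres of radius 1 cannot overlap,
# so any single substitution decodes to exactly one barcode.
barcodes = pick_barcodes(length=5, min_distance=3)

original = barcodes[-1]
# Introduce one substitution error in the first letter.
corrupted = ("C" if original[0] != "C" else "G") + original[1:]
print(original, corrupted, decode(corrupted, barcodes, max_errors=1))
```

Most of the 1,024 possible five-letter strings are thrown out by the distance check, which is the loss the article describes; packing the spheres as tightly as possible keeps that loss small.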

"My contribution has been designing a way to find those barcodes such that even if there is an error in it, you know which original barcode it came from," Hawkins said.

Alternative error-correcting methods for DNA barcodes, such as Levenshtein codes, require throwing away up to 100 times as many barcodes as the FREE method, and they are up to 1,000 times slower at decoding the results. As a result, whereas existing technology made projects with hundreds of millions of barcodes nearly impossible, the new technology allows for rapid, accurate results.

More information: John A. Hawkins et al. Indel-correcting DNA barcodes for high-throughput sequencing, Proceedings of the National Academy of Sciences (2018). DOI: 10.1073/pnas.1802640115


1 / 5 (1) Jun 21, 2018
Just Transfer some properties of this to other Parrots !
Scott Wolfenden
3 / 5 (2) Jun 21, 2018
With this type of research I'm interested in knowing how widely accepted it is and how likely it is to become part of the mainstream research in genetically induced diseases. How long does it take for this methodology to filter into the bulk of research that is related to it. Anyone have experience in this department?
1 / 5 (3) Jun 21, 2018
Scott, the question is 'how do we monetize this'? How do we convince investors to put funding into this? How do we allocate the resources for building, distributing, maintaining, developing future replacements? How do we convince potential customers that our process is better than our competitors? How do we convince potential customers to divert from the investments they have already committed to other processes and projects?

Welcome to the swamp. Where a better mousetrap may not keep the 'gators from chewing on your butt!
5 / 5 (3) Jun 25, 2018
With this type of research I'm interested in knowing how widely accepted it is and how likely it is to become part of the mainstream research in genetically induced diseases. How long does it take for this methodology to filter into the bulk of research that is related to it. Anyone have experience in this department?

Just as with every other type of research, they will need to provide best evidence with a long string of successes before it is accepted into the mainstream, Rome wasn't built in a day.

As far as I can tell, these researchers appear to be dedicated to their work. I suggest that you may want to sign up for Alerts at www.pnas.org. The TOC (table of contents) is a good choice if you would like to be alerted to new and innovative research by the same group. The link is at the bottom of the above article.
1 / 5 (1) Jul 02, 2018
@Scott: If they have implemented free software (they did), chances are that other groups are already trying this out, especially if they got tipped before the publication. Cutting errors by an order of magnitude is impressive, and here it means new applications and/or cheaper projects become available.

@mwillsj: Really? A strawman instead of responding to Scott in good faith!?
