Math technique de-clutters cancer-cell data, revealing tumor evolution, treatment leads

Jun 06, 2013

In our daily lives, clutter is something that gets in our way, something that makes it harder for us to accomplish things. For doctors and scientists trying to parse mountains of raw biological data, clutter is more than a nuisance; it can stand in the way of figuring out how best to treat someone who is very sick.

Using increasingly cheap and rapid methods to read the billions of "letters" that make up human genomes – including the genomes of individual cells sampled from tumors – scientists are generating far more data than they can easily interpret.

Today, two scientists from Cold Spring Harbor Laboratory (CSHL) publish a mathematical method of simplifying and interpreting genome data bearing evidence of mutations, such as those that characterize specific cancers. Not only is the technique highly accurate; it also has immediate utility in efforts to parse data from tumor cells in order to determine a patient's prognosis and the best approach to treatment.

CSHL Assistant Professor Alexander Krasnitz, who developed the new technique jointly with Professor Michael Wigler, explains that it reduces the burden of interpretation by identifying what he and Wigler call COREs, an acronym for "cores of recurrent events."

When data from 100 cells sampled from a single human tumor are analyzed, and the mathematical algorithm devised by Krasnitz and Wigler is applied, the rich structure of the data emerges. This is a "heat map" in which each horizontal row contains data from 1 of the 100 sampled cells, and each vertical column contains information about the presence (black) or absence (no mark) of a "CORE." Each CORE represents a place in the genome of a particular cell that has either amplified DNA (blue bar, top) or deleted DNA (red bar, top). From the mass of data underlying these phenomena, signatures of four subpopulations of tumor cells now become visible. The four groups and their evolutionary relations are shown along the left vertical axis: about half of the cells are "green" and are normal; the red group, consisting of only 4 of the 100 cells, turns out to be genetically the most mutated and dangerous subgroup in this tumor.

Consider the example of a cancerous breast tumor. Central to the CORE concept is what Krasnitz and Wigler refer to as "intervals." An example of an interval would be a segment of DNA that is missing in the genetic sequence of one or more cells sampled from the tumor. Tumor cells are often missing DNA that should normally be present; or conversely, they often have genome intervals in which the normal DNA sequence is amplified – it appears in multiple copies. Such deletions and amplifications are called copy-number variations, or CNVs.

"In cancer," says Krasnitz, "we find intervals in the genome that are hit again and again. You might see this in many cells coming from a single patient's tumor; or you may see these repeating patterns in cells sampled from many patients with a similar cancer type."

In either case, if you superimpose the location of each "hit" – whether a deletion or an amplification of DNA – against a map of the full human genome, "you end up with these wobbly pile-ups, stacks of 'hits' at the same locations in the genome."

Due to the vagaries of collecting data and a certain amount of small-scale variation in the precise boundaries of the deleted or amplified DNA intervals, the stacks don't line up straight; as Krasnitz says, they look "wobbly." This makes them very hard to interpret accurately.
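To make the idea of a "wobbly pile-up" concrete, here is a minimal Python sketch. It is not the CORE algorithm from the paper. It only counts how many observed intervals cover each stretch of a chromosome, then scores a candidate interval by how closely it matches the observed ones. The function names, the half-open coordinate convention, the sample coordinates and the use of Jaccard overlap as the match score are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed convention: an interval is a half-open range [start, end) in base pairs.
Interval = Tuple[int, int]


def pileup_depth(intervals: List[Interval]) -> List[Tuple[int, int, int]]:
    """Return (start, end, depth) segments covered by at least one interval."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # an interval opens here
        events.append((end, -1))    # an interval closes here
    events.sort()

    segments = []
    depth = 0
    prev = None
    for pos, delta in events:
        # Emit the stretch since the previous endpoint if anything covered it.
        if prev is not None and pos > prev and depth > 0:
            segments.append((prev, pos, depth))
        depth += delta
        prev = pos
    return segments


def jaccard(a: Interval, b: Interval) -> float:
    """Overlap length divided by union length (illustrative match score)."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - overlap
    return overlap / union if union else 0.0


def score_candidate(candidate: Interval, intervals: List[Interval]) -> float:
    """Sum of match scores between one candidate and all observed intervals."""
    return sum(jaccard(candidate, iv) for iv in intervals)


# Hypothetical "hits": three with slightly different boundaries, one elsewhere.
hits = [(100, 200), (105, 195), (98, 210), (500, 600)]
for start, end, depth in pileup_depth(hits):
    print(start, end, depth)
```

In this toy input the three overlapping hits stack to a depth of 3 between positions 105 and 195, while their ragged edges show up as shallower segments on either side. A candidate such as (100, 200) scores higher than (500, 600) because it matches more of the observed intervals.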

The CORE method he and Wigler describe in a paper appearing in Proceedings of the National Academy of Sciences "is a mathematical way of cleaning up this mess and untangling these stacks of data, which often overlap." When data from 100 cells from a single tumor are analyzed, for example, and the mathematical algorithm devised by Krasnitz and Wigler is applied, the regularity of the stacks is revealed, and the rich structure of the data emerges.

In the example of analyzing 100 cells from one tumor, the net result is that populations and subpopulations of cancer cells can be distinguished; and if the cancer has already become metastatic, CORE will be useful in discerning the relations among cancer cell subpopulations in various parts of the body. Such analysis is a potentially valuable guide to prognosis and can also help inform important treatment decisions.
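As a rough illustration of how subpopulations might be read off once each cell has been reduced to a presence/absence profile over COREs, the Python sketch below groups cells with identical profiles and measures how many COREs separate two profiles. The cell names, the profiles and the grouping rule are hypothetical. A real analysis would presumably use a clustering method that tolerates noise, rather than the exact matching assumed here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed representation: 1 means the CORE is present in the cell, 0 means absent.
Profile = Tuple[int, ...]


def group_cells_by_profile(profiles: Dict[str, Profile]) -> List[List[str]]:
    """Group cells that share exactly the same profile, largest group first."""
    groups = defaultdict(list)
    for cell, profile in profiles.items():
        groups[profile].append(cell)
    return sorted(groups.values(), key=len, reverse=True)


def hamming(a: Profile, b: Profile) -> int:
    """Number of COREs on which two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical data: two normal-looking cells and two with altered regions.
profiles = {
    "cell1": (0, 0, 0),
    "cell2": (0, 0, 0),
    "cell3": (1, 1, 0),
    "cell4": (1, 1, 1),
}
print(group_cells_by_profile(profiles))
print(hamming(profiles["cell3"], profiles["cell4"]))
```

Here the two cells with no COREs form the largest group, and the single-CORE difference between the last two profiles hints at how closely related groups could be placed next to each other in a tree.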


More information: "Target inference from collections of genomic intervals" appears online today ahead of print in Proceedings of the National Academy of Sciences.
