Computational model links family members using genealogical and law-enforcement databases

October 11, 2018, Cell Press

The notion of using genetic ancestry databases to solve crimes recently crossed from hypothetical into credible when police used an online genealogical database to track down the alleged Golden State Killer, a serial criminal who terrorized much of California in the 1970s and 1980s. Now, in a study published October 11 in Cell, researchers are reporting ways in which that type of inquiry could potentially be expanded.

Specifically, they have published a computational method for linking individuals in ancestry databases to those in law-enforcement databases. These two types of databases use completely different systems of genetic markers. The investigators report in a proof of principle with 872 people that, for close relatives (either sibling or parent-offspring pairs), more than 30% can be accurately matched with the correct relative using nonoverlapping genetic markers from the two different databases.

"There's a legacy problem in that so many DNA profiles have been collected with this older genetic marker system that's been used by law enforcement since the 1990s. The system is not designed for the more challenging queries that are currently of interest, such as identifying people represented in a DNA mixture or identifying relatives of the contributor of a DNA sample," says senior author Noah Rosenberg, a biology professor at Stanford University. "In this study, we were trying to pose the question of whether a newer, more modern system of genetic markers could be tested against the old system and still get matches and find relatives."

The system used by the FBI and other law-enforcement agencies is known as the Combined DNA Index System (CODIS). It relies on short tandem repeat (STR) markers, a type of copy-number variation, in noncoding regions of the DNA. (The system originally used 13 markers; it was recently updated and now includes 20.) By contrast, ancestry databases look for differences in single-nucleotide polymorphisms (SNPs) across hundreds of thousands of sites in the genome.

In a study published last year, Rosenberg's team reported that software could match individuals who appeared in both databases even with genotype datasets that had no shared markers. They matched more than 90% of people using the 13-marker version of CODIS and up to 99% with 20 markers. The key idea is that each STR marker is surrounded by SNPs that are typically inherited together with the STR. As a result, a person's genotypes for those SNPs can partially predict the genotype of the neighboring STR and vice versa. When these subtle correlations are accumulated across many STRs, it becomes possible to match an SNP profile with an STR profile.
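The idea of accumulating weak per-marker evidence can be illustrated with a toy sketch. This is not the authors' software: the locus names, allele frequencies, and the functions `genotype_prob`, `match_score`, and `best_match` are all hypothetical, and the sketch simply assumes that some upstream step has already turned each SNP profile into predicted STR allele probabilities. Under those assumptions, each STR locus contributes a log-likelihood ratio comparing the SNP-based prediction with plain population frequencies, and the ratios are summed across loci.

```python
import math

def genotype_prob(allele_probs, genotype):
    """Probability of an unordered STR genotype (a, b), assuming the two
    alleles are drawn independently from allele_probs (a simplification)."""
    a, b = genotype
    pa = allele_probs.get(a, 1e-6)  # small floor for unseen alleles
    pb = allele_probs.get(b, 1e-6)
    return pa * pb if a == b else 2 * pa * pb

def match_score(str_profile, predicted, population):
    """Sum of per-locus log-likelihood ratios: SNP-based prediction
    versus population frequencies. Positive values favor a match."""
    score = 0.0
    for locus, genotype in str_profile.items():
        p_pred = genotype_prob(predicted[locus], genotype)
        p_pop = genotype_prob(population[locus], genotype)
        score += math.log(p_pred / p_pop)
    return score

def best_match(str_profile, candidates, population):
    """Name of the SNP-profile candidate with the highest match score."""
    return max(
        candidates,
        key=lambda name: match_score(str_profile, candidates[name], population),
    )

# Hypothetical data: two made-up loci, allele labels are repeat counts.
population = {
    "locus1": {10: 0.5, 11: 0.5},
    "locus2": {7: 0.5, 8: 0.5},
}
# Assumed output of an upstream SNP-to-STR prediction step.
candidates = {
    "person_A": {"locus1": {10: 0.9, 11: 0.1}, "locus2": {7: 0.8, 8: 0.2}},
    "person_B": {"locus1": {10: 0.2, 11: 0.8}, "locus2": {7: 0.3, 8: 0.7}},
}
str_profile = {"locus1": (10, 10), "locus2": (7, 8)}

score_a = match_score(str_profile, candidates["person_A"], population)
score_b = match_score(str_profile, candidates["person_B"], population)
winner = best_match(str_profile, candidates, population)
```

In this toy data, no single locus is decisive, but the summed score favors `person_A`. With only 13 to 20 real STR loci, each adding a small amount of evidence, the same kind of accumulation is what would make a cross-database match feasible.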

The new paper builds on that research by examining whether the same approach could connect close family members. The researchers found that when one individual had been analyzed for STR markers and the other for SNP markers, about 30%-32% of parent-offspring pairs and 35%-36% of sibling pairs could be linked.

In the Golden State Killer case, law enforcement submitted DNA collected from one of the crime scenes for SNP genotyping, then used an open-source ancestry database to link that profile with other individuals who were present in the database. The technique reported in the new paper suggests that it might also be possible to perform familial searches that link people in CODIS to relatives in an ancestry database, or vice versa.

The study was intended to provide data for discussing many of the issues surrounding forensic genetics and genomic privacy, Rosenberg explains. "We wanted to examine to what extent these different types of databases can communicate with each other," he says. "It's important for the public to be aware that information between these two types of genetic data can be connected, often in unexpected ways."

When current policies surrounding DNA evidence were established, it wasn't possible to make this connection. "We have shown that the investigative reach of forensic STR profiles might be possible to expand beyond what was previously believed to be the limit," he adds.

In the paper, the researchers note other policy-relevant issues surrounding this expanded capability. For example, certain populations are overrepresented in law-enforcement STR databases. Expanding the use of database searches could change the calculation about who is accessible to investigators from the profiles in those databases. "There has already been a lot of legal analysis on how STR databases are used," Rosenberg says. "With this study, we suggest that SNP databases and their links to STR databases should also be considered in that analysis."

The new findings have applications for other areas of study beyond forensics. For example, ecologists studying organisms in the field could use this approach to determine whether animals living at a particular geographic site descended from animals whose DNA had been collected on a previous sampling trip, even if only STR data is available from the older samples. The linkage tools could also potentially be used to link DNA fragments from ancient humans with each other, for example when multiple samples are tested from an ancient burial site.

More information: Cell, Kim et al. "Statistical detection of relatives typed with disjoint forensic and biomedical loci." http://www.cell.com/cell/fulltext/S0092-8674(18)31180-2 , DOI: 10.1016/j.cell.2018.09.008
