How cops used a public genealogy database in the Golden State Killer case
DNA was credited with cracking the decades-old cold case of the "Golden State Killer," a California serial murderer and rapist. But the detectives used a public database of genetic genealogy called GEDmatch, raising privacy concerns about publicly available DNA profiles.
Detectives working on the case created a fake profile and uploaded a real DNA sample. Matches from distant family members led Sacramento police to the door of the suspect, Joseph James DeAngelo.
The case has created a wave of concern about the privacy of direct-to-consumer DNA testing, mostly carried out by the big genealogy companies like Ancestry.com and 23andMe. Representatives from both companies were quick to defend their policies of not giving information to the police.
But this isn't the first time genealogy information has been used to solve a crime. In a 2015 case, the matches were turned over by Ancestry.com under a search warrant. This time, the police just helped themselves.
Genetic genealogy has generated some of the largest and most useful datasets in the world with little discussion of privacy, particularly around the question of who other than genealogists might access these databases and for what reason. I've been researching these issues for over a decade. I have made a documentary, "Data Mining the Deceased: Ancestry and the Business of Family," and am just finishing a book that expands on the histories of some of the biggest databases in the world.
When you submit your DNA to a public database or a direct-to-consumer genetic genealogy company, you are also submitting information about all of your closest relatives, living and dead. The point of these tests is to discover relatives or, more recently, your percentage of ethnic or racial inheritance.
But the secondary uses of the information, as in the case of the Golden State Killer, have seen little discussion in the face of rapidly increasing sales of ancestral DNA tests. There is a general sense that the information is completely benign.
Public sites like GEDmatch are a boon because they have fewer privacy restrictions than commercial sites. In the wake of public outrage over the amount of personal information collected by Facebook and Google, genealogy sites have more or less stayed under the privacy-concerns radar, until now.
Privacy vs. desire to find relatives
Since 1984, with the advent of a database called RootsWeb (now owned by Ancestry.com), genealogists have been among the first to recognize that the internet could be used to share information and to connect people. Genealogy as a hobby depends on people's eagerness to share personal information, and genealogists are somewhat allergic to privacy constraints, since privacy runs counter to the desire to find relatives.
GEDmatch is a public site organized by genealogy enthusiasts on the model of most non-profit genealogy groups. Everyone uploads information for the greater good of all. Registered members can upload their family tree DNA results from any commercial company, with or without their family trees in the industry-standard GEDCOM format (Genealogical Data Communication, a specification developed by the Church of Jesus Christ of Latter-day Saints).
The site processes the DNA and shows users relative matches, usually cousins, with email addresses attached. That is all good as long as you are a genealogist just looking for relatives who are also looking for relatives.
But nothing prevents other kinds of users from accessing this information as well. GEDmatch seemed genuinely surprised that the police had used its database to track a killer, and on April 27, 2018, it posted this disclaimer for users on its landing page:
"We understand that the GEDmatch database was used to help identify the Golden State Killer. Although we were not approached by law enforcement or anyone else about this case or about the DNA, it has always been GEDmatch's policy to inform users that the database could be used for other uses, as set forth in the Site Policy."
That policy does indeed acknowledge the site could be used to track criminal relatives. Having a dead black sheep in the family can be a source of great family tales, but living miscreants are more of a problem.
"While the database was created for genealogical research, it is important that GEDmatch participants understand the possible uses of their DNA, including identification of relatives that have committed crimes or were victims of crimes," GEDmatch said in its statement.
A day later, the site posted a link that allowed users to easily remove all of their information, including DNA, family trees and registration information. However, removing your information does not mean that you will be forgotten. Here's where everyone involved in the genealogy industry, family historians, commercial providers and non-profit organizations alike, really needs to do some hard thinking about DNA that is linked to family trees.
When you send your DNA to a commercial company for testing, or upload those results to a community site, you are, by design, asking for your information to be shared and linked with every other user on the site. You can set privacy filters that specify how much you want to reveal about yourself, such as your name and contact information.
But the more you reveal on the site, the more family you will find; that's the lure and the promise. Once you are linked with other people and family trees, removing yourself is virtually impossible. At that point, you have no control over who will add you to their tree or link your information in their GEDCOM file.
While catching DeAngelo, if he is the "Golden State Killer," is a huge victory for the public and the police, it's worth noting that he never uploaded anything.