Largest human exome data set reveals an excess of low-frequency non-synonymous coding variants

Oct 05, 2010

In a paper appearing in Nature Genetics today, an international research group reports the resequencing and analysis of 200 human exomes, establishing the largest human exome data set published so far and revealing an excess of low-frequency deleterious non-synonymous mutations. The collaborative team includes investigators from BGI-Shenzhen, UC Berkeley, the University of Copenhagen and other European institutions.

The team used the NimbleGen 2.1M exon capture array to capture 18,654 human coding genes and sequenced the exomes of 200 individuals from Denmark. The average sequencing depth per exome was 12X, and about 95% of the targeted regions were covered by at least one read. In total, 121,870 SNPs were identified in the population, of which about 44% were novel. Of these, 53,081 were coding SNPs (cSNPs), comprising 25,275 synonymous and 27,806 non-synonymous variants; 42.6% of the cSNPs were novel.
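As a quick consistency check on the counts reported above, the short Python sketch below recomputes the coding SNP total and a few derived proportions. The variable names are illustrative only, and the derived percentages are simple arithmetic on the reported figures, not values quoted from the paper.

```python
# Counts as reported in the article.
total_snps = 121_870
synonymous = 25_275
nonsynonymous = 27_806

# The two coding classes should add up to the reported 53,081 cSNPs.
coding = synonymous + nonsynonymous

# Derived proportions (plain arithmetic, not figures from the paper).
coding_share = coding / total_snps        # share of all SNPs that are coding
nonsyn_share = nonsynonymous / coding     # share of cSNPs that are non-synonymous
ns_ratio = nonsynonymous / synonymous     # non-synonymous to synonymous ratio

print(f"coding SNPs: {coding}")
print(f"coding share of all SNPs: {coding_share:.1%}")
print(f"non-synonymous share of cSNPs: {nonsyn_share:.1%}")
print(f"non-synonymous / synonymous ratio: {ns_ratio:.2f}")
```

The two coding classes sum to the reported 53,081, so the figures are internally consistent.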

Using this large population sample, the team applied statistical methods to call SNPs and to estimate the distribution of allele frequencies. To exclude false-positive SNPs, the allele frequency spectrum was estimated only for cSNPs with a minor allele frequency above 2%. Comparing the allele frequency distributions of non-synonymous and synonymous cSNPs revealed a 1.8-fold excess of deleterious non-synonymous cSNPs over synonymous cSNPs in the low-frequency range of 2-5%. Moreover, this excess was higher for SNPs on the X chromosome, suggesting that deleterious mutations are primarily recessive. The team also examined the potential effect of methylation on allele frequencies by comparing the frequency distributions of sites potentially affected by CpG methylation with those of unaffected sites; no strong effect was detected at a genome-wide scale.
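To make the idea of a "fold excess" in a frequency range concrete, here is a minimal Python sketch. It assumes one simple definition: the share of non-synonymous cSNPs falling in a minor allele frequency (MAF) bin, divided by the share of synonymous cSNPs in the same bin. This definition, the function names and the toy frequencies are all assumptions made for illustration; the paper's actual statistical model may differ.

```python
from typing import Iterable


def fraction_in_bin(mafs: Iterable[float], low: float, high: float) -> float:
    """Share of SNPs whose minor allele frequency lies in [low, high)."""
    values = list(mafs)
    if not values:
        raise ValueError("no SNPs supplied")
    in_bin = sum(1 for f in values if low <= f < high)
    return in_bin / len(values)


def fold_excess(nonsyn_mafs: Iterable[float], syn_mafs: Iterable[float],
                low: float = 0.02, high: float = 0.05) -> float:
    """Hypothetical fold excess: non-synonymous share in the bin over synonymous share."""
    syn_fraction = fraction_in_bin(syn_mafs, low, high)
    if syn_fraction == 0:
        raise ValueError("no synonymous SNPs in the bin; ratio undefined")
    return fraction_in_bin(nonsyn_mafs, low, high) / syn_fraction


# Toy frequencies, invented for illustration (not data from the study).
toy_nonsyn = [0.02, 0.03, 0.04, 0.03, 0.10, 0.25, 0.40, 0.30, 0.15, 0.45]
toy_syn = [0.03, 0.04, 0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]

excess = fold_excess(toy_nonsyn, toy_syn)
print(f"fold excess in the 2-5% MAF bin: {excess:.1f}")
```

In the toy data, four of ten non-synonymous SNPs and two of ten synonymous SNPs fall in the 2-5% bin, so the sketch reports a two-fold excess. Synonymous variants serve as the baseline here because they are generally assumed to be close to selectively neutral.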

"The study provides a valuable data set for studying the allele frequency spectrum and population genetic patterns," said Dr Yingrui Li, the project investigator from BGI-Shenzhen. "We found more low-frequency deleterious mutations in coding regions than previously expected, and most of them are recessive. We therefore support the idea that much of the heritable variation affecting fitness is caused by low-frequency mutations."

Association studies have detected only a limited share of the heritable variation associated with common polygenic traits, and genotyping analysis generally overlooks the effects of low-frequency mutations. The results of this study further demonstrate that exome sequencing is an effective and promising approach for identifying genetic variants associated with human traits and for studying population genetics. The team expects that future analyses of non-coding regions and ethnically diverse samples will help build a complete picture of human genomic variation and an understanding of the interaction between genetic drift, mutation, recombination, and selection in the human genome.

Previously, a paper in Science (Science. 2010 July; 329(5987): 75-78) reported the sequencing of the exomes of 50 Tibetan individuals and found evidence of high-altitude adaptation in Tibetan populations. Together, these studies show that next-generation sequencing is finding more applications and has great potential in genomics research, drug discovery and personalized medicine.




