Wood Andrew R, Tuke Marcus A, Nalls Mike, Hernandez Dena, Gibbs J Raphael, Lin Haoxiang, Xu Christopher S, Li Qibin, Shen Juan, Jun Goo, Almeida Marcio, Tanaka Toshiko, Perry John R B, Gaulton Kyle, Rivas Manny, Pearson Richard, Curran Joanne E, Johnson Matthew P, Göring Harald H H, Duggirala Ravindranath, Blangero John, McCarthy Mark I, Bandinelli Stefania, Murray Anna, Weedon Michael N, Singleton Andrew, Melzer David, Ferrucci Luigi, Frayling Timothy M
Genetics of Complex Traits, University of Exeter Medical School, Exeter, UK.
Laboratory of Neurogenetics, National Institute on Aging, Bethesda, MD, USA.
Hum Mol Genet. 2015 Mar 1;24(5):1504-12. doi: 10.1093/hmg/ddu560. Epub 2014 Nov 6.
Initial results from sequencing studies suggest that there are relatively few low-frequency (<5%) variants associated with large effects on common phenotypes. We performed low-pass whole-genome sequencing in 680 individuals from the InCHIANTI study to test two primary hypotheses: (i) that sequencing would detect single low-frequency, large-effect variants that explain similar amounts of phenotypic variance as single common variants, and (ii) that some common variant associations could be explained by low-frequency variants. We tested two sets of disease-related common phenotypes for which we had statistical power to detect large numbers of common variant-common phenotype associations: 11 132 cis-gene expression traits in 450 individuals and 93 circulating biomarkers in all 680 individuals. From a total of 11 657 229 high-quality variants, of which 6 129 221 were common and 5 528 008 were low frequency (<5%), low-frequency, large-effect associations comprised 7% of detectable cis-gene expression traits [89 of 1314 cis-eQTLs at P < 1 × 10⁻⁶ (false discovery rate ∼5%)] and one of eight biomarker associations at P < 8 × 10⁻¹⁰. Very few (30 of 1232; 2%) common variant associations were fully explained by low-frequency variants. Our data show that whole-genome sequencing can identify low-frequency variants undetected by genotyping-based approaches when sample sizes are sufficiently large to detect substantial numbers of common variant associations, and that common variant associations are rarely explained by single low-frequency variants of large effect.
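The counts and percentages quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative consistency check, not part of the study's analysis. All variable names and the `percent` helper are made up for this example; the numbers are copied from the abstract.

```python
# Counts quoted in the abstract (variable names are illustrative only).
total_variants = 11_657_229
common_variants = 6_129_221
low_freq_variants = 5_528_008

low_freq_eqtls, all_eqtls = 89, 1314             # low-frequency, large-effect cis-eQTLs
explained_assocs, common_assocs = 30, 1232       # common associations fully explained


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# The common and low-frequency counts should sum to the reported total.
variant_counts_agree = common_variants + low_freq_variants == total_variants

eqtl_pct = percent(low_freq_eqtls, all_eqtls)
explained_pct = percent(explained_assocs, common_assocs)

print(f"variant counts agree: {variant_counts_agree}")
print(f"low-frequency share of cis-eQTLs: {eqtl_pct:.1f}%")
print(f"common associations fully explained: {explained_pct:.1f}%")
```

Rounded to whole percentages, the two ratios match the 7% and 2% stated in the abstract.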