Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA.
Am J Hum Genet. 2010 Jun 11;86(6):832-8. doi: 10.1016/j.ajhg.2010.04.005. Epub 2010 May 13.
Deep sequencing will soon generate comprehensive sequence information in large disease samples. Although the power to detect association with an individual rare variant is limited, pooling variants by gene or pathway into a composite test provides an alternative strategy for identifying susceptibility genes. We describe a statistical method for detecting association of multiple rare variants in protein-coding genes with a quantitative or dichotomous trait. The approach is based on the regression of phenotypic values on individuals' genotype scores subject to a variable allele-frequency threshold, incorporating computational predictions of the functional effects of missense variants. Statistical significance is assessed by permutation testing with variable thresholds. We used a rigorous population-genetics simulation framework to evaluate the power of the method, and we applied the method to empirical sequencing data from three disease studies.
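The variable-threshold idea described above can be illustrated with a minimal sketch. It assumes genotypes are coded as minor-allele counts (0, 1, 2) and uses a simple score-type z statistic maximized over the observed allele-frequency thresholds, with phenotype permutation to obtain a p-value. The function names `vt_statistic` and `vt_permutation_test` are hypothetical, the statistic is one plausible choice rather than necessarily the exact one used in the paper, and the weighting by predicted functional effect of missense variants is omitted.

```python
import math
import random


def vt_statistic(genotypes, phenotype, maf):
    """Return the largest z-like score over all allele-frequency thresholds."""
    mean = sum(phenotype) / len(phenotype)
    centered = [y - mean for y in phenotype]
    best = -math.inf
    # Each distinct non-zero allele frequency is tried as a threshold.
    for t in sorted({f for f in maf if f > 0}):
        keep = [j for j, f in enumerate(maf) if f <= t]
        # Genotype score: count of minor alleles at variants under the threshold.
        scores = [sum(row[j] for j in keep) for row in genotypes]
        denom = math.sqrt(sum(s * s for s in scores))
        if denom > 0:
            z = sum(s * c for s, c in zip(scores, centered)) / denom
            best = max(best, z)
    return best


def vt_permutation_test(genotypes, phenotype, n_perm=1000, seed=0):
    """Return (observed statistic, one-sided permutation p-value)."""
    n = len(genotypes)
    n_variants = len(genotypes[0])
    maf = [sum(row[j] for row in genotypes) / (2 * n) for j in range(n_variants)]
    rng = random.Random(seed)
    observed = vt_statistic(genotypes, phenotype, maf)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        # Shuffling phenotypes breaks any genotype-phenotype link while
        # re-maximizing over thresholds accounts for the threshold search.
        rng.shuffle(shuffled)
        if vt_statistic(genotypes, shuffled, maf) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because the maximum over thresholds is recomputed for every permutation, the resulting p-value already accounts for having searched across thresholds. The test as written is one-sided (rare alleles enriched among individuals with higher phenotype values); a two-sided variant would compare absolute values instead.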