Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA.
Am J Hum Genet. 2011 Sep 9;89(3):354-67. doi: 10.1016/j.ajhg.2011.07.015. Epub 2011 Sep 1.
Biological and empirical evidence suggests that rare variants account for a large proportion of the genetic contributions to complex human diseases. Recent technological advances in high-throughput sequencing platforms have made it possible for researchers to generate comprehensive information on rare variants in large samples. We provide a general framework for association testing with rare variants by combining mutation information across multiple variant sites within a gene and relating the enriched genetic information to disease phenotypes through appropriate regression models. Our framework covers all major study designs (i.e., case-control, cross-sectional, cohort, and family studies) and all common phenotypes (e.g., binary, quantitative, and age at onset), and it allows arbitrary covariates (e.g., environmental factors and ancestry variables). We derive theoretically optimal procedures for combining rare mutations and construct suitable test statistics for various biological scenarios. The allele-frequency threshold can be fixed or variable. The effects of the combined rare mutations on the phenotype can be in the same direction or different directions. The proposed methods are statistically more powerful and computationally more efficient than existing ones. An application to a deep-resequencing study of drug targets led to the discovery of rare variants associated with total cholesterol. The relevant software is freely available.
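As a rough illustration of the general idea described above (collapsing rare mutations within a gene into a single score and relating that score to a phenotype through a regression model), the following minimal sketch may help. It is not the authors' software and does not reproduce their optimal weighting or their variable-threshold procedure. It assumes a fixed allele-frequency threshold, equal weights for all rare sites, a quantitative phenotype, and a linear model. The function names `burden_scores` and `score_test_linear`, and the default threshold of 0.05, are hypothetical choices made for this example.

```python
import numpy as np
from math import erfc, sqrt


def burden_scores(genotypes, maf_threshold=0.05):
    """Collapse rare variants in one gene into a per-subject burden score.

    genotypes: array of shape (n_subjects, n_sites) holding minor-allele
    counts (0, 1, or 2). Equal weights are an assumption of this sketch.
    """
    G = np.asarray(genotypes, dtype=float)
    maf = G.mean(axis=0) / 2.0                    # sample minor-allele frequency per site
    rare = (maf > 0) & (maf <= maf_threshold)     # fixed-threshold selection of rare sites
    return G[:, rare].sum(axis=1), rare


def score_test_linear(y, burden, covariates=None):
    """Score-type test of the burden effect in a linear model with covariates.

    Returns (z, p), where p is a two-sided p-value from a normal approximation.
    """
    y = np.asarray(y, dtype=float)
    burden = np.asarray(burden, dtype=float)
    n = y.shape[0]
    if covariates is None:
        X = np.ones((n, 1))                       # intercept only
    else:
        X = np.column_stack([np.ones(n), np.asarray(covariates, dtype=float)])

    # Residualize both the phenotype and the burden score on the covariates.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_y = y - X @ beta
    gamma, *_ = np.linalg.lstsq(X, burden, rcond=None)
    resid_b = burden - X @ gamma

    sigma2 = resid_y @ resid_y / (n - X.shape[1])  # residual variance under the null model
    U = resid_b @ resid_y                          # score statistic
    V = sigma2 * (resid_b @ resid_b)               # its estimated variance
    if V == 0:
        return 0.0, 1.0                            # no rare-variant carriers or no variation
    z = U / sqrt(V)
    p = erfc(abs(z) / sqrt(2.0))                   # equals 2 * (1 - Phi(|z|))
    return z, p
```

A call such as `burden, rare = burden_scores(G, maf_threshold=0.05)` followed by `z, p = score_test_linear(y, burden, covariates=C)` would test one gene. Because all rare sites are summed with the same sign, this simple form only suits the scenario in which the mutations act in the same direction. Effects in different directions, binary or age-at-onset phenotypes, and a variable threshold would each need a different statistic or model.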