Department of Epidemiology and Biostatistics, Institute for Human Genetics, University of California, San Francisco, California 94158-9001, USA.
Genet Epidemiol. 2012 Sep;36(6):642-51. doi: 10.1002/gepi.21659. Epub 2012 Jul 16.
New sequencing technologies provide an opportunity to assess the impact of rare and common variants on complex diseases. Several methods have been developed for evaluating rare variants, many of which use weighted collapsing to combine rare variants. Some approaches require an arbitrary frequency threshold below which alleles are collapsed, and most assume that the effect sizes of the collapsed variants are either all the same or a function of minor allele frequency. Some methods further assume that all rare variants are deleterious rather than protective. We expect that such assumptions will not hold in general and that the performance of these tests will suffer as a result. We propose a hierarchical model, implemented in the new program CHARM, to detect the joint signal from rare and common variants within a genomic region while properly accounting for linkage disequilibrium between variants. Our model explores the scale, rather than the center, of the odds ratio distribution, allowing for both causative and protective effects. We use cross-validation to assess the evidence for association in a region, and model averaging to widen the range of disease models under which the method has good power. To assess this approach, we simulated data under a range of disease models with effects at common and/or rare variants. Overall, our method had more power than other well-known rare variant approaches; it performed well when either only rare or only common variants were causal, and better than other approaches when both common and rare variants contributed to disease.
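The contrast between weighted collapsing and a scale-based summary can be illustrated with a minimal Python sketch. This is not the CHARM implementation: the function names, the inverse-standard-deviation weights, the 1% frequency threshold, and the toy genotypes and effect sizes are all assumptions chosen for illustration. The first function collapses rare variants into one weighted burden score per individual; the second shows that when causative and protective effects are mixed, the center of the log odds ratios can be zero while their scale is not.

```python
from math import sqrt


def collapse_rare(genotypes, mafs, threshold=0.01):
    """Weighted burden score per individual (hypothetical helper).

    genotypes: one list of minor-allele counts (0, 1, 2) per individual.
    mafs: minor allele frequency of each variant.
    Only variants with 0 < MAF < threshold are collapsed. The weights
    1 / sqrt(maf * (1 - maf)) are an illustrative assumption.
    """
    rare = [j for j, f in enumerate(mafs) if 0 < f < threshold]
    weights = {j: 1.0 / sqrt(mafs[j] * (1.0 - mafs[j])) for j in rare}
    return [sum(weights[j] * row[j] for j in rare) for row in genotypes]


def center_and_scale(log_odds_ratios):
    """Return the mean and the mean square (second moment about zero)."""
    n = len(log_odds_ratios)
    center = sum(log_odds_ratios) / n
    scale = sum(b * b for b in log_odds_ratios) / n
    return center, scale


# Toy data (assumed): two rare variants and one common variant.
mafs = [0.005, 0.002, 0.20]
genotypes = [
    [1, 0, 2],  # carries the first rare variant
    [0, 0, 1],  # carries no rare variant
    [0, 1, 0],  # carries the second, rarer variant
]
scores = collapse_rare(genotypes, mafs)

# Toy log odds ratios (assumed): half causative, half protective.
mixed_effects = [0.8, -0.8, 0.6, -0.6]
center, scale = center_and_scale(mixed_effects)

print("burden scores:", scores)
print("center:", center, "scale:", scale)
```

In this toy setting the common variant is ignored by the collapsing step entirely, and a test aimed at the center of the effect distribution would see the opposing effects cancel, whereas a summary of the scale still registers that the variants matter.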