Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095-7088, USA.
Genet Epidemiol. 2011;35(Suppl 1):S85-91. doi: 10.1002/gepi.20656.
We summarize the work done by the contributors to Group 13 at Genetic Analysis Workshop 17 (GAW17) and provide a synthesis of their data analyses. The Group 13 contributors used a variety of approaches to test associations of both rare variants and common single-nucleotide polymorphisms (SNPs) with the GAW17 simulated traits, implementing analytic methods that incorporate multiallelic genotypes and haplotypes. In addition to using a wide variety of statistical methods, the contributors showed considerable flexibility and creativity in coding the variants and their genes and in evaluating their proposed methods. We describe and contrast their methods along three dimensions: (1) selection and coding of genetic entities for analysis, (2) method of analysis, and (3) evaluation of the results. The contributors consistently presented a strong rationale for using multiallelic analytic approaches. They indicated that power was likely to be increased by capturing the signals of multiple markers within genetic entities defined by sliding windows, haplotypes, genes, functional pathways, and the entire set of SNPs and rare variants taken in aggregate. Despite this variability in approach, the methods were fairly consistent in their ability to identify two associated genes for each simulated trait. The first gene was selected for the largest number of causal alleles and the second for a high-frequency causal SNP. The presumed model of inheritance and choice of genetic entities are likely to have a strong effect on the outcomes of the analyses.