Sampson Joshua N, Wheeler Bill, Li Peng, Shi Jianxin
Division of Cancer Epidemiology and Genetics, National Cancer Institute.
Information Management Services.
Ann Appl Stat. 2014 Jun;8(2):974-998. doi: 10.1214/14-aoas715.
Large case/control genome-wide association studies (GWAS) often include groups of related individuals with known relationships. When testing for associations at a given locus, current methods incorporate only the familial relationships between individuals. Here, we introduce the chromosome-based Quasi Likelihood Score (cQLS) statistic that incorporates local Identity-By-Descent (IBD) to increase the power to detect associations. In studies robust to population stratification, such as those with case/control sibling pairs, simulations show that the study power can be increased by over 50%. In our example, a GWAS examining late-onset Alzheimer's disease, the p-values of the most strongly associated SNPs in the APOE gene tend to decrease, with the smallest p-value decreasing from 1.23 × 10 to 7.70 × 10. Furthermore, as a part of our simulations, we reevaluate our expectations about the use of families in GWAS. We show that, although it adds only half as many unique chromosomes, genotyping affected siblings is more efficient than genotyping randomly ascertained cases. We also show that genotyping cases with a family history of disease will be less beneficial when searching for SNPs with smaller effect sizes.
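The "half as many unique chromosomes" remark can be checked with a short calculation. The sketch below is not the cQLS statistic and is not taken from the paper. It assumes only the standard IBD-sharing probabilities for full siblings at an autosomal locus (1/4, 1/2, 1/4 for sharing 0, 1, or 2 chromosomes) and no inbreeding. The function and variable names are illustrative.

```python
from fractions import Fraction

# Assumed textbook IBD-sharing probabilities for full siblings at an
# autosomal locus (no inbreeding): share 0, 1, or 2 chromosomes IBD.
SIB_IBD_PROBS = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expected_new_chromosomes(ibd_probs):
    """Expected number of chromosomes a second person contributes that are
    not IBD copies of the two chromosomes carried by the first person."""
    # A person carries 2 chromosomes at the locus; k of them are IBD copies.
    return sum(prob * (2 - k) for k, prob in ibd_probs.items())


# Adding an affected sibling of an already-genotyped case.
sibling_gain = expected_new_chromosomes(SIB_IBD_PROBS)

# Adding an unrelated, randomly ascertained case (never IBD at the locus).
unrelated_gain = expected_new_chromosomes({0: Fraction(1)})

print(sibling_gain, unrelated_gain, sibling_gain / unrelated_gain)
```

Under these assumptions a sibling adds on average one new chromosome per locus, against two for an unrelated case. The abstract's point is that the sibling can still be the more efficient choice despite this.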