Lamina Claudia, Küchenhoff Helmut, Chang-Claude Jenny, Paulweber Bernhard, Wichmann H-Erich, Illig Thomas, Hoehe Margret R, Kronenberg Florian, Heid Iris M
Institute of Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany.
Ann Hum Genet. 2010 Sep 1;74(5):452-62. doi: 10.1111/j.1469-1809.2010.00593.x. Epub 2010 Jul 22.
Haplotypes are an important concept for genetic association studies, but involve uncertainty due to statistical reconstruction from single nucleotide polymorphism (SNP) genotypes and genotype error. We developed a re-sampling approach to quantify haplotype misclassification probabilities and implemented the MC-SIMEX approach to tackle this as a 3 x 3 misclassification problem. Using a previously published approach as a benchmark for comparison, we evaluated the performance of our approach by simulations and exemplified it on real data from 15 SNPs of the APM1 gene. Misclassification due to reconstruction error was small for most, but notable for some, especially rarer haplotypes. Genotype error added misclassification to all haplotypes resulting in a non-negligible drop in sensitivity. In our real data example, the bias of association estimates due to reconstruction error alone reached -48.2% for a 1% genotype error, indicating that haplotype misclassification should not be ignored if high genotype error can be expected. Our 3 x 3 misclassification view of haplotype error adds a novel perspective to currently used methods based on genotype intensities and expected number of haplotype copies. Our findings give a sense of the impact of haplotype error under realistic scenarios and underscore the importance of high-quality genotyping, in which case the bias in haplotype association estimates is negligible.
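As an illustration of the 3 x 3 misclassification view, the following is a minimal MC-SIMEX sketch and not the authors' implementation. It assumes that each individual's haplotype status is coded as the number of copies carried (0, 1 or 2), that the naive estimator is the slope of a simple linear regression, and that a quadratic extrapolant is used. The misclassification matrix, the simulated data and all function names are hypothetical and chosen only for demonstration.

```python
import numpy as np


def frac_power(pi, lam):
    """Return pi**lam for a misclassification matrix whose columns sum to 1."""
    vals, vecs = np.linalg.eig(pi)
    out = np.real(vecs @ np.diag(vals.astype(complex) ** lam) @ np.linalg.inv(vecs))
    out = np.clip(out, 0.0, None)  # guard against tiny negative entries
    return out / out.sum(axis=0, keepdims=True)


def misclassify(x, pi, rng):
    """Draw an observed category for each true category in x.

    pi[i, j] is read as P(observed = i | true = j) (assumed convention).
    """
    cum = np.cumsum(pi, axis=0)[:-1, :]  # cumulative thresholds per true category
    u = rng.random(x.shape[0])
    return (u[None, :] > cum[:, x]).sum(axis=0)


def slope(x, y):
    """Naive estimator: least-squares slope of y on the copy number x."""
    return np.polyfit(x, y, 1)[0]


def mcsimex_slope(x_obs, y, pi, lambdas=(0.5, 1.0, 1.5, 2.0), n_rep=50, seed=0):
    """Return (naive, corrected) slope estimates."""
    rng = np.random.default_rng(seed)
    naive = slope(x_obs, y)
    grid, est = [0.0], [naive]
    for lam in lambdas:
        # Simulation step: add extra misclassification pi**lam to the observed data,
        # so the total misclassification is roughly pi**(1 + lam).
        pi_lam = frac_power(pi, lam)
        reps = [slope(misclassify(x_obs, pi_lam, rng), y) for _ in range(n_rep)]
        grid.append(lam)
        est.append(np.mean(reps))
    # Extrapolation step: fit a quadratic in lambda and evaluate it at lambda = -1,
    # which corresponds to no misclassification.
    coef = np.polyfit(grid, est, 2)
    return naive, float(np.polyval(coef, -1.0))


def simulate_demo(n=20000, beta=0.5, freq=0.3, seed=1):
    """Simulate hypothetical data and return (beta, naive, corrected)."""
    rng = np.random.default_rng(seed)
    # Hypothetical 3 x 3 misclassification matrix (columns = true copy number).
    pi = np.array([[0.90, 0.08, 0.02],
                   [0.08, 0.84, 0.08],
                   [0.02, 0.08, 0.90]])
    x_true = rng.binomial(2, freq, size=n)            # true haplotype copies
    y = beta * x_true + rng.normal(0.0, 0.5, size=n)  # continuous outcome
    x_obs = misclassify(x_true, pi, rng)              # error-prone haplotype copies
    naive, corrected = mcsimex_slope(x_obs, y, pi)
    return beta, naive, corrected


if __name__ == "__main__":
    beta, naive, corrected = simulate_demo()
    print(f"true slope {beta:.3f}, naive {naive:.3f}, MC-SIMEX {corrected:.3f}")
```

In a real analysis the misclassification matrix would not be chosen by hand as it is here. It would come from a procedure such as the re-sampling approach described above, and the outcome model would match the study design.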