Queensland Institute of Medical Research, Herston, Queensland, Australia.
Genet Epidemiol. 2010 Dec;34(8):854-62. doi: 10.1002/gepi.20541.
Erroneous genotypes that pass standard quality control (QC) can severely affect genome-wide association studies, genotype imputation, and the estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNPs). To detect such genotyping errors, a simple two-locus QC method, based on the difference in the association test statistic between single SNPs and pairs of SNPs, was developed and applied. In real data, the proposed approach detected many problematic SNPs with statistical significance even when standard single-SNP QC analyses failed to detect them. Depending on the data set, the number of erroneous SNPs that were not filtered out by standard single-SNP QC but were detected by the proposed approach varied from a few hundred to thousands. In simulated data, the proposed method was powerful and performed better than the other existing methods tested. The power of the proposed approach to detect erroneous genotypes was approximately 80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to better-quality genotypes for subsequent genotype-phenotype investigations.
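The abstract does not give the exact form of the two-locus statistic, so the following is only a rough sketch of the general idea: compare the association signal of a SNP pair with the signal of the stronger of its two single SNPs. It assumes a case-control trait, uses a plain Pearson chi-square on contingency tables, and the function names `chi_square` and `two_locus_excess` are hypothetical. It should not be read as the authors' test, and it does not cover how to calibrate the resulting value into a significance level.

```python
from collections import Counter


def chi_square(rows, cols):
    """Pearson chi-square statistic and degrees of freedom for two
    equal-length categorical vectors (e.g. genotype and case/control status)."""
    n = len(rows)
    row_tot = Counter(rows)
    col_tot = Counter(cols)
    obs = Counter(zip(rows, cols))  # missing cells count as 0
    stat = 0.0
    for r, rt in row_tot.items():
        for c, ct in col_tot.items():
            expected = rt * ct / n
            stat += (obs[(r, c)] - expected) ** 2 / expected
    dof = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, dof


def two_locus_excess(snp_a, snp_b, status):
    """Assumed illustration, not the published statistic: association
    statistic of the joint two-SNP genotype minus the larger of the two
    single-SNP statistics. A large excess means the pair is associated
    with status far more strongly than either SNP alone."""
    stat_a, _ = chi_square(snp_a, status)
    stat_b, _ = chi_square(snp_b, status)
    joint = list(zip(snp_a, snp_b))  # joint genotype as one category
    stat_ab, _ = chi_square(joint, status)
    return stat_ab - max(stat_a, stat_b)


if __name__ == "__main__":
    # Toy data: neither SNP is associated with status on its own,
    # but the pair of genotypes determines status completely.
    snp_a = [0, 0, 1, 1]
    snp_b = [0, 1, 0, 1]
    status = [0, 1, 1, 0]
    print(two_locus_excess(snp_a, snp_b, status))  # 4.0
```

In a QC setting, a sketch like this would be run over pairs of nearby SNPs, and pairs with an unusually large excess would be flagged for inspection. The choice of pairs and the flagging threshold are not specified in the abstract and would have to come from the full paper.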