Estimation of genotype error rate using samples with pedigree information--an application on the GeneChip Mapping 10K array.

Authors

Hao Ke, Li Cheng, Rosenow Carsten, Hung Wong Wing

Affiliation

Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA.

Publication information

Genomics. 2004 Oct;84(4):623-30. doi: 10.1016/j.ygeno.2004.05.003.

Abstract

Currently, most analytical methods assume that all observed genotypes are correct; however, it is clear that genotyping errors may reduce statistical power or bias inference in genetic studies. We propose procedures for estimating the error rate in genetic analysis and apply them to the GeneChip Mapping 10K array, a recently available technology that allows researchers to survey over 10,000 SNPs in a single assay. We employed a two-step strategy to estimate the genotype error rate in pedigree data. First, the "dose-response" reference curve relating the error rate to the number of observable errors was derived by simulation, conditional on the given pedigree structures and genotypes. Second, the error rate was estimated by calibrating the number of errors observed in real data against the reference curve. We evaluated the performance of this method by simulation and applied it to a data set of 30 pedigrees genotyped using the GeneChip Mapping 10K array. The method performed favorably in all scenarios we surveyed. The dose-response reference curve was monotone and almost linear with a large slope. The method accurately estimated the error rate under various pedigree structures and error models and under heterogeneous error rates. Using this method, we found that the average genotyping error rate of the GeneChip Mapping 10K array was about 0.1%. Our method provides a quick and unbiased way to estimate the genotype error rate in pedigree data. It behaves well in a wide range of settings and can be easily applied in other genetic projects. Robust estimation of the genotyping error rate allows us to estimate power and sample size and to conduct unbiased genetic tests. The GeneChip Mapping 10K array has a low overall error rate, which is consistent with the results obtained from alternative genotyping assays.
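The two-step strategy described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes parent-child trios as the pedigree structure, treats a Mendelian inconsistency as the "observable error", uses a simple error model in which an erroneous genotype is replaced by one of the other two genotypes at random, and starts from simulated error-free genotypes. All function names, the allele frequency, and the grid of error rates are hypothetical choices made for illustration.

```python
import random

# Gametes each genotype (0, 1 or 2 copies of allele B) can transmit.
GAMETES = {0: (0,), 1: (0, 1), 2: (1,)}

def mendel_consistent(father, mother, child):
    """True if the child's genotype is possible given both parents."""
    return any(a + b == child for a in GAMETES[father] for b in GAMETES[mother])

def simulate_trios(n, rng, freq=0.3):
    """Draw n error-free trio genotypes (assumed data generator)."""
    def founder():
        return int(rng.random() < freq) + int(rng.random() < freq)
    def transmit(g):
        return rng.choice(GAMETES[g])
    trios = []
    for _ in range(n):
        f, m = founder(), founder()
        trios.append((f, m, transmit(f) + transmit(m)))
    return trios

def add_errors(trios, rate, rng):
    """Assumed error model: with probability `rate`, swap in another genotype."""
    def corrupt(g):
        if rng.random() < rate:
            return rng.choice([x for x in (0, 1, 2) if x != g])
        return g
    return [tuple(corrupt(g) for g in trio) for trio in trios]

def count_inconsistencies(trios):
    """Number of trio genotypes that violate Mendelian inheritance."""
    return sum(not mendel_consistent(*trio) for trio in trios)

def reference_curve(trios, rates, n_rep, rng):
    """Step 1: mean observable error count at each simulated error rate."""
    return [
        sum(count_inconsistencies(add_errors(trios, r, rng)) for _ in range(n_rep)) / n_rep
        for r in rates
    ]

def calibrate(observed, rates, curve):
    """Step 2: invert the curve by linear interpolation between grid points."""
    if observed <= curve[0]:
        return rates[0]
    for i in range(len(rates) - 1):
        c0, c1 = curve[i], curve[i + 1]
        if c0 <= observed <= c1 and c1 > c0:
            return rates[i] + (rates[i + 1] - rates[i]) * (observed - c0) / (c1 - c0)
    return rates[-1]  # observed count lies beyond the simulated range

rng = random.Random(0)
clean = simulate_trios(10_000, rng)              # 10,000 trio-by-SNP genotypes
rates = [0.0, 0.005, 0.01, 0.02, 0.04]           # hypothetical grid of error rates
curve = reference_curve(clean, rates, n_rep=5, rng=rng)
observed = count_inconsistencies(add_errors(clean, 0.01, rng))  # "real" data, true rate 1%
est = calibrate(observed, rates, curve)
print(curve, observed, est)
```

In this toy setting the estimate should land near the true rate of 0.01, because only a fraction of genotype errors produce a Mendelian inconsistency and the reference curve accounts for that fraction. Applying the idea to real data would require the actual pedigree structures and a justified error model, which the abstract does not specify.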

