Fridley Brooke L, McDonnell Shannon K, Rabe Kari G, Tang Rui, Biernacka Joanna M, Sinnwell Jason P, Rider David N, Goode Ellen L
Department of Health Sciences Research, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA.
BMC Proc. 2009 Dec 15;3 Suppl 7(Suppl 7):S7. doi: 10.1186/1753-6561-3-s7-s7.
Due to the growing need to combine data across multiple studies and to impute untyped markers based on a reference sample, several analytical tools for imputation and analysis of missing genotypes have been developed. Current imputation methods rely on single imputation, which ignores the variation in estimation due to imputation. An alternative to single imputation is multiple imputation. In this paper, we assess the variation in imputation by completing both single and multiple imputations of genotypic data using MACH, a commonly used hidden Markov model imputation method. Using data from the North American Rheumatoid Arthritis Consortium genome-wide study, we assessed single and multiple imputation in four regions of chromosome 1 with varying levels of linkage disequilibrium and association signals. We considered two scenarios for missing genotypic data: imputation of untyped markers and combination of genotypic data from two studies. This limited study of four regions indicates that, contrary to expectations, multiple imputation may not be necessary.
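The abstract does not say how estimates from the multiple imputations were combined. One standard way to account for the variation that single imputation ignores is Rubin's rules, which add a between-imputation variance term to the average within-imputation variance. The Python sketch below illustrates that idea only; it is not the authors' procedure, and the function name `pool_rubin` and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative sketch).

    estimates: one effect estimate (e.g. a log odds ratio) per imputed data set
    variances: the squared standard error of each of those estimates
    Returns (pooled_estimate, within_var, between_var, total_var).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled = mean(estimates)        # average estimate across imputations
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, within, between, total


# Hypothetical numbers: three imputations of the same marker-trait association.
pooled, within, between, total = pool_rubin([0.40, 0.50, 0.60], [0.04, 0.04, 0.04])
print(f"pooled estimate: {pooled:.3f}")
print(f"within-imputation variance: {within:.4f}")
print(f"between-imputation variance: {between:.4f}")
print(f"total variance: {total:.4f}")
```

When the between-imputation variance is small relative to the within-imputation variance, the total variance is close to what a single imputation would report, which is one way to read the abstract's conclusion that multiple imputation may not be necessary.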