York Thomas L, Durrett Richard T, Tanksley Steven, Nielsen Rasmus
Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14850, USA.
Genet Res. 2005 Apr;85(2):159-68. doi: 10.1017/S0016672305007494.
Interest has recently grown in Markov chain Monte Carlo (MCMC)-based Bayesian methods for estimating genetic maps. The advantage of these methods is that they can deal accurately with missing data and genotyping errors. Here we present an extension of the previous methods that makes the Bayesian method applicable to large data sets. We present an extensive simulation study examining the statistical properties of the method and comparing it with the likelihood method implemented in Mapmaker. We show that the maximum a posteriori (MAP) estimator of the genetic distances, corresponding to the maximum likelihood estimator, performs better than estimators based on the posterior expectation. We also show that while Mapmaker and the MCMC-based method perform similarly in the absence of genotyping errors, the MCMC-based method has a distinct advantage in the presence of genotyping errors. A similar advantage of the Bayesian method was not observed for missing data. We also re-analyse a recently published eggplant data set and show that the MCMC-based method leads to smaller estimates of genetic distances.
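The abstract's distinction between the MAP estimator and the posterior expectation can be illustrated with a toy calculation. The sketch below is not the multi-locus MCMC method described above. It assumes a single marker interval, a flat prior on the recombination fraction r over (0, 0.5], error-free data, and hypothetical counts (k recombinants in n meioses). Under those assumptions the MAP estimate coincides with the maximum likelihood estimate k/n, while the posterior mean is pulled upward by the right-skewed posterior. The function names are illustrative only.

```python
import math


def grid_posterior(k, n, steps=5000):
    """Normalized posterior weights for r on a midpoint grid over (0, 0.5).

    Assumes a flat prior, so the posterior is proportional to the
    binomial likelihood r**k * (1 - r)**(n - k).
    """
    rs = [0.5 * (i + 0.5) / steps for i in range(steps)]
    log_post = [k * math.log(r) + (n - k) * math.log(1.0 - r) for r in rs]
    peak = max(log_post)  # subtract the peak to avoid underflow
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return rs, [w / total for w in weights]


def map_and_mean(k, n):
    """Return (MAP estimate, posterior mean) of r on the grid."""
    rs, weights = grid_posterior(k, n)
    best = max(range(len(rs)), key=weights.__getitem__)
    r_map = rs[best]
    r_mean = sum(r * w for r, w in zip(rs, weights))
    return r_map, r_mean


# Hypothetical counts chosen for illustration, not taken from the paper.
r_map, r_mean = map_and_mean(k=3, n=50)
print(f"MAP estimate:   {r_map:.4f}")
print(f"Posterior mean: {r_mean:.4f}")
```

With few recombinants the posterior mean exceeds the MAP estimate, which shows one way the two estimators can diverge. Whether this accounts for the simulation results reported in the abstract is not something this toy model establishes.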