Helmkamp Laura J, Jewett Ethan M, Rosenberg Noah A
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
J Comput Biol. 2012 Jun;19(6):632-49. doi: 10.1089/cmb.2012.0042.
Among the methods currently available for inferring species trees from gene trees, the GLASS method of Mossel and Roch (2010), the Shallowest Divergence (SD) method of Maddison and Knowles (2006), the STEAC method of Liu et al. (2009), and a related method that we call Minimum Average Coalescence (MAC) are computationally efficient and provide branch length estimates. Further, GLASS and STEAC have been shown to be consistent estimators of tree topology under a multispecies coalescent model. However, the divergence time estimates obtained with all four methods are systematically biased under the model, because the pairwise interspecific gene divergence times on which they rely must be more ancient than the species divergence time. Jewett and Rosenberg (2012) derived an expression for the bias of GLASS and used it to propose an improved method that they termed iGLASS. Here, we derive the biases of SD, STEAC, and MAC, and we propose improved analogues of these methods that we call iSD, iSTEAC, and iMAC. We conduct simulations to compare the performance of the improved methods with their original counterparts and with GLASS and iGLASS, finding that each of the improved methods decreases the bias and mean squared error of pairwise divergence time estimates. The new methods can therefore contribute to improvements in the estimation of species trees from information on gene trees.
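The source of the bias described above can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions that are not taken from the abstract: one sampled lineage per species, independent loci, and time measured in coalescent units, so that the extra waiting time to coalescence in the ancestral population is exponential with rate 1. The function names are hypothetical, and the "corrected" estimator is a toy bias subtraction for this restricted setting only, not the iGLASS, iSD, iSTEAC, or iMAC expressions derived by the authors.

```python
import random
from statistics import mean


def simulate_gene_divergences(tau, n_loci, rng):
    """Simulate interspecific gene divergence times at independent loci.

    Assumption (not from the abstract): one lineage per species and time in
    coalescent units, so each gene divergence time is the species divergence
    time tau plus an Exponential(1) waiting time in the ancestral population.
    """
    return [tau + rng.expovariate(1.0) for _ in range(n_loci)]


def min_across_loci(times):
    """Minimum-based estimate of tau (a GLASS-like summary in this toy case)."""
    return min(times)


def mean_across_loci(times):
    """Average-based estimate of tau (a STEAC-like summary in this toy case)."""
    return mean(times)


def corrected_min_across_loci(times):
    """Toy correction: the minimum of L Exponential(1) draws has mean 1/L,
    so subtract that expected excess. Illustrative only."""
    return min(times) - 1.0 / len(times)


def estimated_bias(estimator, tau=2.0, n_loci=10, n_reps=20000, seed=0):
    """Monte Carlo estimate of E[estimator] - tau."""
    rng = random.Random(seed)
    estimates = [
        estimator(simulate_gene_divergences(tau, n_loci, rng))
        for _ in range(n_reps)
    ]
    return mean(estimates) - tau


if __name__ == "__main__":
    for name, estimator in [
        ("minimum across loci", min_across_loci),
        ("mean across loci", mean_across_loci),
        ("corrected minimum", corrected_min_across_loci),
    ]:
        print(f"{name}: estimated bias = {estimated_bias(estimator):+.3f}")
```

In this toy setting every simulated gene divergence time is at least `tau`, so both uncorrected summaries overestimate it: in expectation by about 1 coalescent unit for the mean and by about 1/L for the minimum over L loci. Subtracting the expected excess removes the bias of the minimum here. The general case, with several lineages per species, is what the bias expressions in the paper address and is not reproduced in this sketch.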