Department of Computer Science, Rice University, 6100 Main Street, Houston, TX 77005, USA.
Genome Biol Evol. 2023 Jun 1;15(6). doi: 10.1093/gbe/evad094.
The evolutionary histories of individual loci in a genome can be estimated independently, but this approach is error-prone because only a limited amount of sequence data is available for each gene. This has led to the development of a diverse array of gene tree error correction methods, which reduce the distance between an estimated gene tree and the species tree. We investigated the performance of two representatives of these methods: TRACTION and TreeFix. We found that gene tree error correction frequently increases the level of error in gene tree topologies by "correcting" them to be closer to the species tree, even when the true gene and species trees are discordant. We confirm that full Bayesian inference of the gene trees under the multispecies coalescent model is more accurate than independent inference. Future gene tree correction methods should incorporate an adequately realistic model of evolution instead of relying on oversimplified heuristics.
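The central claim, that moving a gene tree closer to the species tree can move it further from the truth, can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the procedure used by TRACTION or TreeFix: it assumes a rooted Robinson-Foulds-style distance (the number of clades found in one tree but not the other), and the five-taxon trees and the "corrected" tree are hypothetical examples invented for this purpose.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree given as nested tuples.

    Leaves are strings; each clade is a frozenset of leaf names.
    The root clade (all leaves) is excluded because every tree shares it.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    root = leaves(tree)
    found.discard(root)
    return found


def rf_distance(a, b):
    """Rooted Robinson-Foulds-style distance: clades present in exactly one tree."""
    return len(clades(a) ^ clades(b))


# Hypothetical trees (assumptions for illustration only).
species_tree = (((("A", "B"), "C"), "D"), "E")
true_gene_tree = (((("A", "C"), "B"), "D"), "E")   # truly discordant with the species tree
estimated_tree = ((("A", "C"), "B"), ("D", "E"))   # one estimation error: the (D, E) clade
corrected_tree = ((("A", "B"), "C"), ("D", "E"))   # hypothetical correction toward the species tree

for name, tree in [("estimated", estimated_tree), ("corrected", corrected_tree)]:
    print(name,
          "distance to species tree:", rf_distance(tree, species_tree),
          "distance to true gene tree:", rf_distance(tree, true_gene_tree))
```

In this toy case the hypothetical correction halves the distance to the species tree (from 4 to 2) but doubles the distance to the true gene tree (from 2 to 4), because it overwrites the genuinely discordant clade (A, C) while leaving the actual estimation error in place.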