A Likelihood-Ratio Test for Lumpability of Phylogenetic Data: Is the Markovian Property of an Evolutionary Process Retained in Recoded DNA?

Authors

Vera-Ruiz Victor A, Robinson John, Jermiin Lars S

Affiliations

School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia.

Department of Mathematics and Statistics, University of Nevada, Reno, NV 89557, USA.

Publication

Syst Biol. 2022 Apr 19;71(3):660-675. doi: 10.1093/sysbio/syab074.

Abstract

In molecular phylogenetics, it is typically assumed that the evolutionary process for DNA can be approximated by independent and identically distributed Markovian processes at the variable sites and that these processes diverge over the edges of a rooted bifurcating tree. Sometimes the nucleotides are transformed from a 4-state alphabet to a 3- or 2-state alphabet by a procedure that is called recoding, lumping, or grouping of states. Here, we introduce a likelihood-ratio test for lumpability for DNA that has diverged under different Markovian conditions, which assesses the assumption that the Markovian property of the evolutionary process over each edge is retained after recoding of the nucleotides. The test is derived and validated numerically on simulated data. To demonstrate the insights that can be gained by using the test, we assessed two published data sets, one of mitochondrial DNA from a phylogenetic study of the ratites and the other of nuclear DNA from a phylogenetic study of yeast. Our analysis of these data sets revealed that recoding of the DNA eliminated some of the compositional heterogeneity detected over the sequences. However, the Markovian property of the original evolutionary process was not retained by the recoding, leading to some significant distortions of edge lengths in reconstructed trees. [Evolutionary processes; likelihood-ratio test; lumpability; Markovian processes; Markov models; phylogeny; recoding of nucleotides.]
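
As a rough illustration of what recoding and lumpability mean, the sketch below recodes nucleotides to a 2-state purine/pyrimidine (RY) alphabet and checks the classical strong-lumpability condition on a rate matrix: for every pair of distinct groups, each state in the first group must have the same total rate into the second. This is not the likelihood-ratio test introduced in the paper, which works on sequence data rather than on a known rate matrix. The RY grouping, the function names, and both example matrices are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical recoding scheme: purines (A, G) -> R, pyrimidines (C, T) -> Y.
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}
STATES = "ACGT"  # row/column order assumed for the rate matrices below


def recode(seq, mapping=RY):
    """Recode a nucleotide string into the lumped alphabet."""
    return "".join(mapping[nt] for nt in seq)


def is_lumpable(Q, mapping=RY, states=STATES, tol=1e-9):
    """Check the strong-lumpability condition for rate matrix Q.

    For every ordered pair of distinct groups (a, b), the total rate from
    each state in group a into group b must be the same for all states in a.
    """
    groups = {}
    for idx, s in enumerate(states):
        groups.setdefault(mapping[s], []).append(idx)
    for a, b in product(groups, repeat=2):
        if a == b:
            continue
        rates = [sum(Q[i][j] for j in groups[b]) for i in groups[a]]
        if max(rates) - min(rates) > tol:
            return False
    return True


# Kimura two-parameter style matrix (transition rate 2.0, transversion
# rate 0.5): every nucleotide leaves its RY group at the same total rate.
K80 = [
    [-3.0, 0.5, 2.0, 0.5],
    [0.5, -3.0, 0.5, 2.0],
    [2.0, 0.5, -3.0, 0.5],
    [0.5, 2.0, 0.5, -3.0],
]

# Made-up matrix in which A and G leave the purine group at different
# total rates (0.3 versus 0.5), so the RY-recoded process is not Markovian.
Q_BAD = [
    [-0.5, 0.1, 0.2, 0.2],
    [0.2, -0.6, 0.2, 0.2],
    [0.2, 0.3, -0.7, 0.2],
    [0.2, 0.2, 0.2, -0.6],
]

print(recode("ACGTTGCA"))
print(is_lumpable(K80), is_lumpable(Q_BAD))
```

In this toy setting the condition can be read directly off the rate matrix. With real alignments the generating process is unknown, which is why a statistical test such as the one described in the abstract is needed.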
