GTRpmix: A Linked General Time-Reversible Model for Profile Mixture Models.

Affiliations

Department of Mathematics, California State University San Bernardino, San Bernardino, CA, USA.

Department of Biochemistry and Molecular Biology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada.

Publication information

Mol Biol Evol. 2024 Sep 4;41(9). doi: 10.1093/molbev/msae174.

Abstract

Profile mixture models capture distinct biochemical constraints on the amino acid substitution process at different sites in proteins. These models feature a mixture of time-reversible models with a common matrix of exchangeabilities and distinct sets of equilibrium amino acid frequencies known as profiles. Combining the exchangeability matrix with each profile generates the matrix of instantaneous rates of amino acid exchange for that profile. Currently, empirically estimated exchangeability matrices (e.g. the LG matrix) are widely used for phylogenetic inference under profile mixture models. However, these were estimated using a single profile and are unlikely to be optimal for profile mixture models. Here, we describe the GTRpmix model, which allows maximum likelihood estimation of a common exchangeability matrix under any profile mixture model. We show that exchangeability matrices estimated under profile mixture models differ from the LG matrix and dramatically improve model fit and topological estimation accuracy for empirical test cases. Because the GTRpmix model is computationally expensive, we provide two exchangeability matrices estimated from large concatenated phylogenomic supermatrices to be used for phylogenetic analyses. One, called Eukaryotic Linked Mixture (ELM), is designed for phylogenetic analysis of proteins encoded by nuclear genomes of eukaryotes, and the other, Eukaryotic and Archaeal Linked mixture (EAL), for reconstructing relationships between eukaryotes and Archaea. These matrices, combined with profile mixture models, fit data better and give improved topology estimation relative to the LG matrix combined with the same mixture models. Starting with version 2.3.1, IQ-TREE2 allows users to estimate linked exchangeabilities (i.e. amino acid exchange rates) under profile mixture models.
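The way a shared exchangeability matrix combines with each profile can be illustrated with a minimal sketch. It is not the authors' implementation or IQ-TREE2 code. It assumes the standard time-reversible construction, in which the off-diagonal rate from state i to state j is the exchangeability S[i][j] multiplied by the profile frequency of j, and it rescales each matrix to an expected rate of 1, a common convention that the abstract does not state. The function name `rate_matrix`, the toy 4-state alphabet (real models use 20 amino acids), and all numeric values are hypothetical.

```python
def rate_matrix(exchangeabilities, profile):
    """Build a time-reversible rate matrix Q from a symmetric exchangeability
    matrix S and one profile (equilibrium frequencies).

    Assumed convention: Q[i][j] = S[i][j] * profile[j] for i != j, diagonal
    chosen so each row sums to zero, then Q rescaled so that the expected
    rate -sum_i profile[i] * Q[i][i] equals 1.
    """
    n = len(profile)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exchangeabilities[i][j] * profile[j]
        # The diagonal entry is still 0.0 here, so this sums the off-diagonals.
        q[i][i] = -sum(q[i])
    scale = -sum(profile[i] * q[i][i] for i in range(n))
    return [[x / scale for x in row] for row in q]


# Hypothetical shared (linked) exchangeabilities on a toy 4-state alphabet.
S = [
    [0.0, 1.0, 2.0, 1.0],
    [1.0, 0.0, 1.0, 3.0],
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 3.0, 1.0, 0.0],
]

# Two hypothetical profiles; each yields its own rate matrix from the same S.
profiles = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]

rate_matrices = [rate_matrix(S, pi) for pi in profiles]
```

Each matrix in `rate_matrices` satisfies detailed balance with respect to its own profile (profile[i] * Q[i][j] equals profile[j] * Q[j][i]), which is what makes every mixture component time-reversible while the exchangeabilities stay common to all of them.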

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3b1/11371462/1a94074b91e3/msae174f1.jpg
