MixtureFinder: Estimating DNA Mixture Models for Phylogenetic Analyses.

Authors

Ren Huaiyan, Wong Thomas K F, Minh Bui Quang, Lanfear Robert

Affiliations

School of Computing, College of Engineering, Computing and Cybernetics, Australian National University, Canberra, ACT 2600, Australia.

Ecology and Evolution, Research School of Biology, College of Science, Australian National University, Canberra, ACT 2600, Australia.

Publication information

Mol Biol Evol. 2025 Jan 6;42(1). doi: 10.1093/molbev/msae264.

DOI:10.1093/molbev/msae264
PMID:39715360
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11704958/

Abstract

In phylogenetic studies, both partitioned models and mixture models are used to account for heterogeneity in molecular evolution among the sites of DNA sequence alignments. Partitioned models require the user to specify the grouping of sites into subsets, and then assume that each subset of sites can be modeled by a single common process. Mixture models do not require users to prespecify subsets of sites, and instead calculate the likelihood of every site under every model, while co-estimating the model weights and parameters. While much research has gone into the optimization of partitioned models by merging user-specified subsets, less attention has been paid to the optimization of mixture models for DNA sequence alignments. In this study, we first ask whether a key assumption of partitioned models, namely that each user-specified subset can be modeled by a single common process, is supported by the data. Having shown that this is not the case, we then design, implement, test, and apply an algorithm, MixtureFinder, to select the optimum number of classes for a mixture model of Q-matrices for the standard models of DNA sequence evolution. We show that this algorithm performs well on simulated and empirical datasets and suggest that it may be useful for future empirical studies. MixtureFinder is available in IQ-TREE2, and a tutorial for using MixtureFinder can be found here: http://www.iqtree.org/doc/Complex-Models#mixture-models.
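
As a rough illustration of the mixture-model likelihood described in the abstract (every site is scored under every class, and the class weights are estimated alongside), the sketch below computes a mixture log-likelihood and performs one weight update. It is not MixtureFinder or IQ-TREE code: the function names are hypothetical, the per-site likelihoods are made-up numbers rather than values from a real alignment, and the update is a generic expectation-maximization step that holds the class parameters fixed.

```python
import math


def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a mixture of classes.

    site_likelihoods[i][k] is the likelihood of site i under class k
    (assumed here to be precomputed); weights[k] is the weight of class k.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    total = 0.0
    for per_class in site_likelihoods:
        # A site's likelihood is the weighted sum over all classes.
        site_lik = sum(w * lik for w, lik in zip(weights, per_class))
        total += math.log(site_lik)
    return total


def em_update_weights(site_likelihoods, weights):
    """One EM step for the class weights, with class likelihoods held fixed."""
    n_sites = len(site_likelihoods)
    new_weights = [0.0] * len(weights)
    for per_class in site_likelihoods:
        joint = [w * lik for w, lik in zip(weights, per_class)]
        site_lik = sum(joint)
        # Posterior probability that this site belongs to each class.
        for k, j in enumerate(joint):
            new_weights[k] += j / site_lik
    # The new weight of a class is its average posterior over all sites.
    return [w / n_sites for w in new_weights]


# Toy input: three sites, two classes (illustrative numbers only).
toy_likelihoods = [
    [0.02, 0.001],
    [0.0005, 0.03],
    [0.01, 0.01],
]
start_weights = [0.5, 0.5]
updated_weights = em_update_weights(toy_likelihoods, start_weights)
ll_before = mixture_log_likelihood(toy_likelihoods, start_weights)
ll_after = mixture_log_likelihood(toy_likelihoods, updated_weights)
```

In a full analysis the class parameters (for example the Q-matrices) and the tree would be optimized together with the weights; here they are treated as fixed so that only the weighting step is visible.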
