Computational Reproducibility of Molecular Phylogenies.

Affiliations

Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA.

Department of Biology, Temple University, Philadelphia, PA, USA.

Publication information

Mol Biol Evol. 2023 Jul 5;40(7). doi: 10.1093/molbev/msad165.

DOI: 10.1093/molbev/msad165
PMID: 37467477
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10370456/
Abstract

Repeated runs of the same program can generate different molecular phylogenies from identical data sets under the same analytical conditions. This lack of reproducibility of inferred phylogenies casts a long shadow on downstream research employing these phylogenies in areas such as comparative genomics, systematics, and functional biology. We have assessed the relative accuracies and log-likelihoods of alternative phylogenies generated for computer-simulated and empirical data sets. Our findings indicate that these alternative phylogenies reconstruct evolutionary relationships with comparable accuracy. They also have similar log-likelihoods that are not inferior to the log-likelihoods of the true tree. We determined that the direct relationship between irreproducibility and inaccuracy is due to their common dependence on the amount of phylogenetic information in the data. While computational reproducibility can be enhanced through more extensive heuristic searches for the maximum likelihood tree, this does not lead to higher accuracy. We conclude that computational irreproducibility plays a minor role in molecular phylogenetics.
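The abstract does not say how differences between replicate phylogenies were quantified. As a minimal sketch, assuming a Robinson-Foulds style comparison (a common way to count topological disagreement, not necessarily the authors' method), the following Python counts the splits that two trees do not share. The nested-tuple tree encoding and the names `leaves`, `bipartitions`, `rf_distance`, `run1`, and `run2` are hypothetical and chosen only for illustration.

```python
from itertools import chain


def leaves(tree):
    """Return the frozenset of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial splits induced by the internal branches."""
    all_taxa = leaves(tree)
    reference = min(all_taxa)  # fixes one orientation per split
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_taxa - side
        # Skip trivial splits: the root, or a single leaf on either side.
        if len(side) > 1 and len(other) > 1:
            # Always store the half without the reference taxon, so the
            # same split is recorded identically however the tree is rooted.
            splits.add(other if reference in side else side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Two hypothetical replicate runs on the same five taxa (assumed data).
run1 = ((("A", "B"), "C"), ("D", "E"))
run2 = ((("A", "C"), "B"), ("D", "E"))

print(rf_distance(run1, run1))  # identical topologies share all splits
print(rf_distance(run1, run2))  # the runs disagree on where B and C attach
```

A distance of zero between two runs means the inferred topologies are identical. Under this measure, a data set would count as reproducible when all pairs of replicate runs have distance zero.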




