
An Evaluation of Phylogenetic Workflows in Viral Molecular Epidemiology.

Affiliation

Department of Computer Science & Engineering, University of California San Diego, La Jolla, CA 92093, USA.

Publication information

Viruses. 2022 Apr 8;14(4):774. doi: 10.3390/v14040774.

DOI:10.3390/v14040774
PMID:35458504
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9032411/
Abstract

The use of viral sequence data to inform public health intervention has become increasingly common in the realm of epidemiology. Such methods typically utilize multiple sequence alignments and phylogenies estimated from the sequence data. Like all estimation techniques, they are error prone, yet the impacts of such imperfections on downstream epidemiological inferences are poorly understood. To address this, we executed multiple commonly used viral phylogenetic analysis workflows on simulated viral sequence data, modeling Human Immunodeficiency Virus (HIV), Hepatitis C Virus (HCV), and Ebolavirus, and we computed multiple methods of accuracy, motivated by transmission-clustering techniques. For multiple sequence alignment, MAFFT consistently outperformed MUSCLE and Clustal Omega, in both accuracy and runtime. For phylogenetic inference, FastTree 2, IQ-TREE, RAxML-NG, and PhyML had similar topological accuracies, but branch lengths and pairwise distances were consistently most accurate in phylogenies inferred by RAxML-NG. However, FastTree 2 was the fastest, by orders of magnitude, and when the other tools were used to optimize branch lengths along a fixed FastTree 2 topology, the resulting phylogenies had accuracies that were indistinguishable from their original counterparts, but with a fraction of the runtime.
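The hybrid workflow described in the abstract (align with MAFFT, infer a topology with FastTree 2, then re-optimize branch lengths on that fixed topology with another tool) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the file names, the `GTR+G` model choice, the `out` directory, and the helper names `build_workflow_commands` and `run_workflow` are assumptions, and RAxML-NG is picked here simply because the abstract reports it as having the most accurate branch lengths.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def build_workflow_commands(
    fasta: str, workdir: str = "out", model: str = "GTR+G"
) -> Dict[str, List[str]]:
    """Build the three command lines of the hybrid workflow (hypothetical helper).

    File names and the substitution model are assumptions for illustration.
    """
    out = Path(workdir)
    aln = str(out / "aligned.fasta")   # MAFFT output (captured from stdout)
    topo = str(out / "fasttree.nwk")   # FastTree 2 output (captured from stdout)
    return {
        # MAFFT writes the alignment to stdout; --auto lets it pick a strategy.
        "align": ["mafft", "--auto", fasta],
        # FastTree 2 on nucleotide data with GTR; the tree goes to stdout.
        "topology": ["FastTree", "-nt", "-gtr", "-gamma", aln],
        # RAxML-NG --evaluate keeps the given topology fixed and optimizes
        # branch lengths and model parameters.
        "branch_lengths": [
            "raxml-ng", "--evaluate",
            "--msa", aln,
            "--tree", topo,
            "--model", model,
            "--prefix", str(out / "reoptimized"),
        ],
    }


def run_workflow(fasta: str, workdir: str = "out") -> None:
    """Run the three steps in order; requires the tools to be on PATH."""
    cmds = build_workflow_commands(fasta, workdir)
    out = Path(workdir)
    out.mkdir(parents=True, exist_ok=True)
    with open(out / "aligned.fasta", "w") as fh:
        subprocess.run(cmds["align"], stdout=fh, check=True)
    with open(out / "fasttree.nwk", "w") as fh:
        subprocess.run(cmds["topology"], stdout=fh, check=True)
    subprocess.run(cmds["branch_lengths"], check=True)
```

Calling `run_workflow("sequences.fasta")` would execute the steps in sequence, assuming `mafft`, `FastTree`, and `raxml-ng` are installed. Separating command construction from execution makes it easy to inspect or log the exact commands before running them on a large dataset.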


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d4e/9032411/698b025e60da/viruses-14-00774-g001a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d4e/9032411/6d314d13ab50/viruses-14-00774-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d4e/9032411/815b77e192f9/viruses-14-00774-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d4e/9032411/75f7cc30f856/viruses-14-00774-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d4e/9032411/f2c2455782b7/viruses-14-00774-g005.jpg
