Online Phylogenetics using Parsimony Produces Slightly Better Trees and is Dramatically More Efficient for Large SARS-CoV-2 Phylogenies than de novo and Maximum-Likelihood Approaches.

Authors

Thornlow Bryan, Kramer Alexander, Ye Cheng, De Maio Nicola, McBroome Jakob, Hinrichs Angie S, Lanfear Robert, Turakhia Yatish, Corbett-Detig Russell

Affiliations

Department of Biomolecular Engineering, University of California, Santa Cruz; Santa Cruz, CA 95064, USA.

Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA 95064, USA.

Publication information

bioRxiv. 2022 May 18:2021.12.02.471004. doi: 10.1101/2021.12.02.471004.

DOI:10.1101/2021.12.02.471004
PMID:35611334
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9128781/
Abstract

Phylogenetics has been foundational to SARS-CoV-2 research and public health policy, assisting in genomic surveillance, contact tracing, and assessing emergence and spread of new variants. However, phylogenetic analyses of SARS-CoV-2 have often relied on tools designed for de novo phylogenetic inference, in which all data are collected before any analysis is performed and the phylogeny is inferred once from scratch. SARS-CoV-2 datasets do not fit this mould. There are currently over 10 million sequenced SARS-CoV-2 genomes in online databases, with tens of thousands of new genomes added every day. Continuous data collection, combined with the public health relevance of SARS-CoV-2, invites an "online" approach to phylogenetics, in which new samples are added to existing phylogenetic trees every day. The extremely dense sampling of SARS-CoV-2 genomes also invites a comparison between likelihood and parsimony approaches to phylogenetic inference. Maximum likelihood (ML) methods are more accurate when there are multiple changes at a single site on a single branch, but this accuracy comes at a large computational cost, and the dense sampling of SARS-CoV-2 genomes means that these instances will be extremely rare because each internal branch is expected to be extremely short. Therefore, it may be that approaches based on maximum parsimony (MP) are sufficiently accurate for reconstructing phylogenies of SARS-CoV-2, and their simplicity means that they can be applied to much larger datasets. Here, we evaluate the performance of de novo and online phylogenetic approaches, and ML and MP frameworks, for inferring large and dense SARS-CoV-2 phylogenies. Overall, we find that online phylogenetics produces similar phylogenetic trees to de novo analyses for SARS-CoV-2, and that MP optimizations produce more accurate SARS-CoV-2 phylogenies than do ML optimizations. Since MP is thousands of times faster than presently available implementations of ML and online phylogenetics is faster than de novo, we therefore propose that, in the context of comprehensive genomic epidemiology of SARS-CoV-2, MP online phylogenetics approaches should be favored.
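
The two ideas the abstract combines, parsimony scoring and online placement of a new sample onto an existing tree, can be illustrated with a toy sketch. This is not the authors' implementation or any published tool: the function names (`fitch_site`, `parsimony_score`, `placements`, `place_online`), the nested-tuple tree encoding, and the four-base example sequences are all assumptions made for illustration. Parsimony is scored with Fitch's algorithm, and the new sample is placed by exhaustively trying every branch, which is only feasible for tiny trees.

```python
from typing import Dict, Iterator, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Fitch's algorithm for one alignment column.

    Returns the candidate state set at the root of `tree` and the
    minimum number of changes needed below it.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: one change is required on a child branch.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, seqs: Dict[str, str]) -> int:
    """Sum the per-site Fitch costs over an equal-length alignment."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in seqs.items()})[1]
        for i in range(length)
    )


def placements(tree: Tree, new_leaf: str) -> Iterator[Tree]:
    """Yield every tree obtained by attaching `new_leaf` to one branch."""
    yield (tree, new_leaf)  # attach on the branch above this node
    if not isinstance(tree, str):
        left, right = tree
        for alt in placements(left, new_leaf):
            yield (alt, right)
        for alt in placements(right, new_leaf):
            yield (left, alt)


def place_online(tree: Tree, seqs: Dict[str, str], new_leaf: str) -> Tree:
    """Add one sample to an existing tree at the most parsimonious branch."""
    return min(placements(tree, new_leaf), key=lambda t: parsimony_score(t, seqs))


# Hypothetical toy data: sample E is identical to D.
tree = (("A", "B"), ("C", "D"))
seqs = {"A": "AAAA", "B": "AAAT", "C": "GGAA", "D": "GGTA", "E": "GGTA"}
print(parsimony_score(tree, seqs))
print(place_online(tree, seqs, "E"))
```

In this example the existing tree is left untouched and only the position of the new sample is searched, which is the sense in which the approach is "online"; a de novo analysis would instead rebuild the whole tree from all five sequences.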
