GRAViTy-V2: a grounded viral taxonomy application.

Author information

Mayne Richard, Aiewsakun Pakorn, Turner Dann, Adriaenssens Evelien M, Simmonds Peter

Affiliations

Peter Medawar Building for Pathogen Research, Nuffield Department of Medicine, University of Oxford, 3 South Parks Road, OX1 3SY Oxfordshire, UK.

Department of Microbiology, Faculty of Science, Mahidol University, 272 Rama VI Road, Thung Phaya Thai, Ratchathewi, Bangkok 10400, Thailand.

Publication information

NAR Genom Bioinform. 2024 Dec 18;6(4):lqae183. doi: 10.1093/nargab/lqae183. eCollection 2024 Dec.

Abstract

Taxonomic classification of viruses is essential for understanding their evolution. Genomic classification of viruses at higher taxonomic ranks, such as order or phylum, is typically based on alignment and comparison of amino acid sequence motifs in conserved genes. Classification at lower taxonomic ranks, such as genus or species, is usually based on nucleotide sequence identities between genomic sequences. Building on our whole-genome analytical classification framework, we describe here Genome Relationships Applied to Viral Taxonomy Version 2 (GRAViTy-V2), which encompasses a greatly expanded range of features and numerous optimisations, packaged as an application that may be used as a general-purpose virus classification tool. Using 28 datasets derived from the ICTV 2022 taxonomy proposals, GRAViTy-V2 output was compared against the human expert-curated classifications used for assignments in the 2023 round of ICTV taxonomy changes. GRAViTy-V2 produced taxonomies equivalent to the manually curated versions down to the family level and, in almost all cases, to the genus and species levels. The majority of discrepant results arose from errors in coding sequence annotations in INSDC records, or from the inclusion of incomplete genome sequences in the analysis. Analysis times ranged from 1 to 506 min (median 3.59 min) on datasets with 17 to 1004 genomes and mean genome lengths of 3000 to 1 000 000 bases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a08a/11655284/d34ef67cf346/lqae183fig1.jpg
