VirTAXA: enhancing RNA virus taxonomic classification with remote homology search and tree-based validation.

Affiliation

Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, 999077, China SAR.

Publication information

Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae575.

DOI:10.1093/bioinformatics/btae575
PMID:39325874
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11464415/
Abstract

SUMMARY

RNA viruses are ubiquitous across a broad spectrum of ecosystems. Therefore, beyond their significant implications for public health, RNA viruses are also key players in ecological processes. High-throughput sequencing has accelerated the discovery of RNA viruses. Nevertheless, many of these viruses lack taxonomic annotation, posing a challenge to functional inference and evolutionary study. In particular, virus classification at the genus level remains difficult because of limited reference data and ambiguous boundaries between some closely related genera. We introduce VirTAXA, a robust classification tool that combines remote homology search and tree-based validation to enhance the genus-level taxonomic classification of RNA viruses. VirTAXA predicts the genus label of an assembled viral contig and provides an evidence type for each prediction. It achieves accuracy comparable to state-of-the-art methods while assigning genus labels to a greater number of sequences. Specifically, on the Global Ocean RNA metatranscriptomic data, VirTAXA assigns genus labels to 18% more contigs than the second-best classification tool. Furthermore, we demonstrate that VirTAXA can be conveniently extended to other types of viruses.

AVAILABILITY AND IMPLEMENTATION

The source code and data of VirTAXA are available via https://github.com/JudithEllyn/VirTAXA.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/80e3/11464415/4da9f71ab8dd/btae575f1.jpg
