VirDetect-AI: a residual and convolutional neural network-based metagenomic tool for eukaryotic viral protein identification.

Authors

Zárate Alida, Díaz-González Lorena, Taboada Blanca

Affiliations

Doctorado en Ciencias, Instituto de Investigación en Ciencias Básicas Aplicadas (IICBA), Universidad Autónoma del Estado de Morelos, Cuernavaca, Morelos 62210, México.

Centro de Investigación en Ciencias, Universidad Autónoma del Estado de Morelos, Cuernavaca, Morelos 62210, México.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf001.

DOI:10.1093/bib/bbaf001
PMID:39808116
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11729733/
Abstract

This study addresses the challenging task of identifying viruses within metagenomic data, which encompasses a broad array of biological samples, including animal reservoirs, environmental sources, and the human body. Traditional methods for virus identification often face limitations due to the diversity and rapid evolution of viral genomes. In response, recent efforts have focused on leveraging artificial intelligence (AI) techniques to enhance accuracy and efficiency in virus detection. However, existing AI-based approaches are primarily binary classifiers: they lack specificity in identifying viral types and rely on nucleotide sequences. To address these limitations, VirDetect-AI, a novel tool specifically designed for the identification of eukaryotic viruses within metagenomic datasets, is introduced. The VirDetect-AI model combines convolutional neural networks and residual neural networks to extract hierarchical features and detailed patterns from complex amino acid sequence data. The model performed strongly on all reported metrics, with a sensitivity of 0.97, a precision of 0.98, and an F1-score of 0.98. VirDetect-AI improves our comprehension of viral ecology and can accurately classify metagenomic sequences into 980 viral protein classes, which enables the identification of new viruses. These classes encompass an extensive array of viral genera and families, as well as protein functions and hosts.
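To make the idea of combining convolutional and residual layers over amino acid input more concrete, here is a minimal pure-Python sketch of a single residual block applied to a one-hot encoded protein sequence. It is an illustration only, not the VirDetect-AI implementation: the abstract does not specify the input encoding, kernel width, number of layers, or channel counts, so the one-hot encoding, the `same` zero padding, and all function names below (`one_hot`, `conv1d_same`, `residual_block`) are assumptions made for this example.

```python
# Illustrative sketch only; encoding, layer shapes, and names are assumptions,
# not details taken from the VirDetect-AI paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Encode a protein sequence as a list of 20-dim one-hot rows.

    Residues outside the 20 standard letters become all-zero rows.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in seq.upper():
        row = [0.0] * len(AMINO_ACIDS)
        if aa in index:
            row[index[aa]] = 1.0
        rows.append(row)
    return rows


def conv1d_same(x, kernels):
    """1D convolution along the sequence with zero padding.

    x:       list of L rows, each with C_in channels.
    kernels: list of C_out kernels, each K rows by C_in channels (K odd).
    Returns a list of L rows, each with C_out channels.
    """
    length, c_in = len(x), len(x[0])
    width = len(kernels[0])
    pad = width // 2
    out = []
    for pos in range(length):
        row = []
        for kernel in kernels:
            acc = 0.0
            for offset in range(width):
                src = pos + offset - pad
                if 0 <= src < length:  # positions outside the sequence count as zero
                    for c in range(c_in):
                        acc += kernel[offset][c] * x[src][c]
            row.append(acc)
        out.append(row)
    return out


def relu(x):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in x]


def residual_block(x, kernels_a, kernels_b):
    """Compute relu(x + conv_b(relu(conv_a(x)))).

    The skip connection adds the block input back to the convolved signal,
    so both convolutions must preserve the channel count of x.
    """
    h = relu(conv1d_same(x, kernels_a))
    h = conv1d_same(h, kernels_b)
    summed = [[a + b for a, b in zip(rx, rh)] for rx, rh in zip(x, h)]
    return relu(summed)
```

With all-zero kernels the block simply passes its input through the skip connection, which is the property that lets deep residual stacks fall back to an identity mapping. A full classifier would stack several such blocks and end with a pooling step and a 980-way output layer, but those details are not given in the abstract.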

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/e3c129fafd74/bbaf001f9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/804c367b0534/bbaf001f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/09ea4f8a92a2/bbaf001f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/ea2b794ab96b/bbaf001f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/6b3bd613f475/bbaf001f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/1e9e4023b32d/bbaf001f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/64aefa6159c2/bbaf001f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/698bc3e2f97c/bbaf001f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf7a/11729733/cda73864df2a/bbaf001f8.jpg
