Learning transferable deep convolutional neural networks for the classification of bacterial virulence factors.

Affiliations

NHC Key Laboratory of Systems Biology of Pathogens, Institute of Pathogen Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100176, China.

Australian Institute for Machine Learning, The University of Adelaide, Adelaide, SA 5005, Australia.

Publication information

Bioinformatics. 2020 Jun 1;36(12):3693-3702. doi: 10.1093/bioinformatics/btaa230.

DOI: 10.1093/bioinformatics/btaa230
PMID: 32251507

Abstract

MOTIVATION

Identification of virulence factors (VFs) is critical to the elucidation of bacterial pathogenesis and prevention of related infectious diseases. Current computational methods for VF prediction focus on binary classification or involve only several class(es) of VFs with sufficient samples. However, thousands of VF classes are present in real-world scenarios, and many of them only have a very limited number of samples available.

RESULTS

We first construct a large VF dataset, covering 3446 VF classes with 160 495 sequences, and then propose deep convolutional neural network models for VF classification. We show that (i) for common VF classes with sufficient samples, our models can achieve state-of-the-art performance with an overall accuracy of 0.9831 and an F1-score of 0.9803; (ii) for uncommon VF classes with limited samples, our models can learn transferable features from auxiliary data and achieve good performance with accuracy ranging from 0.9277 to 0.9512 and F1-score ranging from 0.9168 to 0.9446 when combined with different predefined features, outperforming traditional classifiers by 1-13% in accuracy and by 1-16% in F1-score.

AVAILABILITY AND IMPLEMENTATION

All of our datasets are made publicly available at http://www.mgc.ac.cn/VFNet/, and the source code of our models is publicly available at https://github.com/zhengdd0422/VFNet.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
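
The accuracy and F1-score figures quoted in the Results can be read against a small worked example. The sketch below is not taken from the VFNet source code. It assumes a single-label multi-class setting and a macro-averaged F1 (the abstract does not state which averaging was used), and the class names are hypothetical placeholders.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    # Per-class F1 = 2*TP / (2*TP + FP + FN); every label in the union
    # has at least one TP, FP, or FN, so the denominator is never zero.
    scores = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in labels]
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
y_true = ["adhesin", "adhesin", "toxin", "toxin", "invasin"]
y_pred = ["adhesin", "toxin", "toxin", "toxin", "invasin"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

With thousands of classes and very uneven class sizes, macro averaging weights rare classes as heavily as common ones, which is one reason accuracy and F1 can diverge on a dataset like the one described.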

Similar articles

1
Learning transferable deep convolutional neural networks for the classification of bacterial virulence factors.
Bioinformatics. 2020 Jun 1;36(12):3693-3702. doi: 10.1093/bioinformatics/btaa230.
2
DeepVF: a deep learning-based hybrid framework for identifying virulence factors using the stacking strategy.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa125.
3
PlasGUN: gene prediction in plasmid metagenomic short reads using deep learning.
Bioinformatics. 2020 May 1;36(10):3239-3241. doi: 10.1093/bioinformatics/btaa103.
4
Dataset-aware multi-task learning approaches for biomedical named entity recognition.
Bioinformatics. 2020 Aug 1;36(15):4331-4338. doi: 10.1093/bioinformatics/btaa515.
5
Chromatin accessibility prediction via convolutional long short-term memory networks with k-mer embedding.
Bioinformatics. 2017 Jul 15;33(14):i92-i101. doi: 10.1093/bioinformatics/btx234.
6
DeepPhos: prediction of protein phosphorylation sites with deep learning.
Bioinformatics. 2019 Aug 15;35(16):2766-2773. doi: 10.1093/bioinformatics/bty1051.
7
DeepT3: deep convolutional neural networks accurately identify Gram-negative bacterial type III secreted effectors using the N-terminal sequence.
Bioinformatics. 2019 Jun 1;35(12):2051-2057. doi: 10.1093/bioinformatics/bty931.
8
Transfer learning for biomedical named entity recognition with neural networks.
Bioinformatics. 2018 Dec 1;34(23):4087-4094. doi: 10.1093/bioinformatics/bty449.
9
DeepMito: accurate prediction of protein sub-mitochondrial localization using convolutional neural networks.
Bioinformatics. 2020 Jan 1;36(1):56-64. doi: 10.1093/bioinformatics/btz512.
10
Protein-protein interaction site prediction through combining local and global features with deep neural networks.
Bioinformatics. 2020 Feb 15;36(4):1114-1120. doi: 10.1093/bioinformatics/btz699.

Cited by

1
Immunosenescence: How Aging Increases Susceptibility to Bacterial Infections and Virulence Factors.
Microorganisms. 2024 Oct 11;12(10):2052. doi: 10.3390/microorganisms12102052.
2
Annotation of Functions of Sequences of Concern and Its Relevance to the New Biosecurity Regulatory Framework in the United States.
Appl Biosaf. 2024 Sep 18;29(3):142-149. doi: 10.1089/apb.2023.0030. eCollection 2024 Sep.
3
Highly accurate classification and discovery of microbial protein-coding gene functions using FunGeneTyper: an extensible deep learning framework.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae319.
4
A deep learning method to predict bacterial ADP-ribosyltransferase toxins.
Bioinformatics. 2024 Jul 1;40(7). doi: 10.1093/bioinformatics/btae378.
5
Identification of small molecules affecting the interaction between human hemoglobin and Staphylococcus aureus IsdB hemophore.
Sci Rep. 2024 Apr 9;14(1):8272. doi: 10.1038/s41598-024-55931-8.
6
RVdb: a comprehensive resource and analysis platform for rhinovirus research.
Nucleic Acids Res. 2024 Jan 5;52(D1):D770-D776. doi: 10.1093/nar/gkad937.
7
A novel lytic bacteriophage against colistin-resistant Escherichia coli isolated from different animals.
Virus Res. 2023 May;329:199090. doi: 10.1016/j.virusres.2023.199090. Epub 2023 Mar 28.
8
VFDB 2022: a general classification scheme for bacterial virulence factors.
Nucleic Acids Res. 2022 Jan 7;50(D1):D912-D917. doi: 10.1093/nar/gkab1107.
9
DeePhage: distinguishing virulent and temperate phage-derived sequences in metavirome data with a deep learning approach.
Gigascience. 2021 Sep 8;10(9). doi: 10.1093/gigascience/giab056.