Department of Computer Science, Virginia Tech, Blacksburg, VA, USA.
Department of Civil and Environmental Engineering, Virginia Tech, Blacksburg, VA, USA.
Microbiome. 2018 Feb 1;6(1):23. doi: 10.1186/s40168-018-0401-z.
BACKGROUND: Growing concerns about increasing rates of antibiotic resistance call for expanded and comprehensive global monitoring. Better methods for monitoring environmental media (e.g., wastewater, agricultural waste, food, and water) are especially needed to identify potential sources of novel antibiotic resistance genes (ARGs), hot spots for gene exchange, and pathways for the spread of ARGs and human exposure. Next-generation sequencing now enables direct access to and profiling of the total metagenomic DNA pool, where ARGs are typically identified or predicted based on the "best hits" of sequence searches against existing databases. Unfortunately, this approach produces a high rate of false negatives. To address this limitation, we propose a deep learning approach that takes into account a dissimilarity matrix created using all known categories of ARGs. Two deep learning models, DeepARG-SS and DeepARG-LS, were constructed for short read sequences and full gene length sequences, respectively.
RESULTS: Evaluation of the deep learning models over 30 antibiotic resistance categories demonstrates that the DeepARG models can predict ARGs with both high precision (> 0.97) and high recall (> 0.90). The models displayed an advantage over the typical best hit approach, yielding consistently lower false negative rates and thus higher overall recall (> 0.9). As more data become available for under-represented ARG categories, the performance of the DeepARG models can be expected to improve further, owing to the nature of the underlying neural networks. Our newly developed ARG database, DeepARG-DB, encompasses ARGs predicted with a high degree of confidence and subjected to extensive manual inspection, greatly expanding current ARG repositories.
CONCLUSIONS: The deep learning models developed here offer more accurate antimicrobial resistance annotation relative to current bioinformatics practice. DeepARG does not require strict cutoffs, which enables identification of a much broader diversity of ARGs. The DeepARG models and database are available as a command line version and as a Web service at http://bench.cs.vt.edu/deeparg .
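The contrast between a strict best hit rule and a profile over all known references can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the reference names, the categories, the toy alignment values, and the 0.8 identity cutoff are hypothetical, and the per-category sum is only a stand-in for a trained model, not the DeepARG network or its actual features.

```python
from collections import defaultdict

# Hypothetical reference genes and their resistance categories (illustrative only).
REFERENCE_CATEGORY = {
    "refA1": "beta_lactam",
    "refA2": "beta_lactam",
    "refB1": "tetracycline",
}


def best_hit_call(hits, identity_cutoff=0.8):
    """Return the category of the top-scoring hit, or None if it fails the cutoff.

    `hits` maps a reference name to an (identity, bit_score) pair.
    The 0.8 cutoff is an assumed value, not one taken from the paper.
    """
    if not hits:
        return None
    ref, (identity, _bits) = max(hits.items(), key=lambda kv: kv[1][1])
    return REFERENCE_CATEGORY[ref] if identity >= identity_cutoff else None


def similarity_profile(hits):
    """Build a feature vector with one bit score per reference (0.0 if no hit)."""
    return [hits.get(ref, (0.0, 0.0))[1] for ref in sorted(REFERENCE_CATEGORY)]


def category_scores(hits):
    """Sum bit scores per category; a crude stand-in for a learned classifier."""
    per_category = defaultdict(float)
    for ref, (_identity, bits) in hits.items():
        per_category[REFERENCE_CATEGORY[ref]] += bits
    return dict(per_category)


# A divergent read: weak identity to every reference, but consistently
# closer to the beta-lactam references than to the tetracycline one.
divergent_read = {
    "refA1": (0.55, 48.0),
    "refA2": (0.52, 44.0),
    "refB1": (0.30, 12.0),
}

print(best_hit_call(divergent_read))       # the strict cutoff rejects the read
print(similarity_profile(divergent_read))  # the full profile keeps the signal
print(category_scores(divergent_read))
```

In this toy case the best hit rule discards the read because its top hit falls below the cutoff, which is the kind of false negative the abstract describes, while the profile across all references still points to a single category that a trained model could pick up.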