DeepCheck: multitask learning aids in assessing microbial genome quality.

Affiliations

State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, 163 Xianlin Avenue, Qixia District, Nanjing 210000, China.

Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, 159 Panlong road, Xuanwu District, Nanjing 210000, China.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae539.

DOI:10.1093/bib/bbae539
PMID:39438078
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11495869/
Abstract

Metagenomic analyses facilitate the exploration of the microbial world, advancing our understanding of microbial roles in ecological and biological processes. A pivotal aspect of metagenomic analysis involves assessing the quality of metagenome-assembled genomes (MAGs), crucial for accurate biological insights. Current machine learning-based methods often treat completeness and contamination prediction as separate tasks, overlooking their inherent relationship and limiting models' generalization. In this study, we present DeepCheck, a multitasking deep learning framework for simultaneous prediction of MAG completeness and contamination. DeepCheck consistently outperforms existing tools in accuracy across various experimental settings and demonstrates comparable speed while maintaining high predictive accuracy even for new lineages. Additionally, we employ interpretable machine learning techniques to identify specific genes and pathways that drive the model's predictions, enabling independent investigation and assessment of these biological elements for deeper insights.
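
The multitask idea in the abstract, one model predicting completeness and contamination together, can be illustrated with a small sketch. This is not DeepCheck's architecture, feature set, or loss function, none of which the abstract specifies. It assumes a single shared hidden layer feeding two linear regression heads, an unweighted sum of two mean-squared-error terms as the joint loss, and purely synthetic toy data. All names (`init_params`, `forward`, `train_step`) are hypothetical.

```python
import numpy as np

def init_params(n_features, n_hidden, rng):
    # One shared trunk (W1, b1) plus two task-specific linear heads.
    return {
        "W1": rng.normal(0.0, 0.1, (n_features, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w_comp": rng.normal(0.0, 0.1, n_hidden),  # completeness head
        "b_comp": 0.0,
        "w_cont": rng.normal(0.0, 0.1, n_hidden),  # contamination head
        "b_cont": 0.0,
    }

def forward(params, X):
    # Shared ReLU representation used by both heads.
    H = np.maximum(0.0, X @ params["W1"] + params["b1"])
    comp = H @ params["w_comp"] + params["b_comp"]
    cont = H @ params["w_cont"] + params["b_cont"]
    return H, comp, cont

def joint_loss(comp, cont, y_comp, y_cont):
    # Assumed joint objective: unweighted sum of the two MSE terms.
    return np.mean((comp - y_comp) ** 2) + np.mean((cont - y_cont) ** 2)

def train_step(params, X, y_comp, y_cont, lr=0.05):
    # One full-batch gradient-descent step; returns the loss before the update.
    n = X.shape[0]
    H, comp, cont = forward(params, X)
    d_comp = 2.0 * (comp - y_comp) / n
    d_cont = 2.0 * (cont - y_cont) / n
    # Both tasks send gradient into the shared trunk: this coupling is
    # what makes the model multitask rather than two separate regressors.
    dH = np.outer(d_comp, params["w_comp"]) + np.outer(d_cont, params["w_cont"])
    dZ = dH * (H > 0)
    grads = {
        "W1": X.T @ dZ,
        "b1": dZ.sum(axis=0),
        "w_comp": H.T @ d_comp,
        "b_comp": d_comp.sum(),
        "w_cont": H.T @ d_cont,
        "b_cont": d_cont.sum(),
    }
    for key in params:
        params[key] = params[key] - lr * grads[key]
    return joint_loss(comp, cont, y_comp, y_cont)

# Synthetic toy data only: random "feature" vectors and made-up targets,
# not real genome annotations or real quality labels.
rng = np.random.default_rng(0)
X = rng.random((200, 20))
y_comp = X[:, :10].mean(axis=1)          # toy completeness target
y_cont = 0.2 * X[:, 5:15].mean(axis=1)   # toy contamination target

params = init_params(n_features=20, n_hidden=16, rng=rng)
losses = [train_step(params, X, y_comp, y_cont) for _ in range(200)]
_, pred_comp, pred_cont = forward(params, X)
```

Because the two heads share `W1` and `b1`, an update driven by the completeness error also changes the representation used for contamination, and vice versa. In practice the two loss terms are often weighted; the equal weighting here is only a simplifying assumption.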

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cd/11495869/b309724959d8/bbae539f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cd/11495869/2ddef26edba0/bbae539f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cd/11495869/6cb5c2c3d2cb/bbae539f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cd/11495869/8ee6c22aa315/bbae539f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cd/11495869/3377f82ddc7f/bbae539f5.jpg
