Hierarchical Microbial Functions Prediction by Graph Aggregated Embedding.

Authors

Hou Yujie, Zhang Xiong, Zhou Qinyan, Hong Wenxing, Wang Ying

Affiliations

Department of Automation, Xiamen University, Xiamen, China.

Department of Automation, University of Science and Technology of China, Hefei, China.

Publication information

Front Genet. 2021 Jan 18;11:608512. doi: 10.3389/fgene.2020.608512. eCollection 2020.

DOI:10.3389/fgene.2020.608512
PMID:33584804
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7874084/
Abstract

Matching 16S rRNA gene sequencing data to a metabolic reference database is a meaningful way to predict the metabolic function of bacteria and archaea, and it gives greater insight into how the microbial community works. However, some operational taxonomic units (OTUs) cannot be functionally profiled, especially for microbial communities from non-human samples cultured in defective media. We therefore report the development of Hierarchical micrObial functions Prediction by graph aggregated Embedding (HOPE), which uses co-occurrence patterns and nucleotide sequences to predict microbial functions. HOPE integrates the topological structure of microbial co-occurrence networks with the k-mer compositions of OTU sequences and embeds them into a lower-dimensional continuous latent space, while maximally preserving the topological relationships among OTUs. The KEGG Orthology (KO) functions of microbes are highly imbalanced, which usually leads to poor performance; our framework takes this imbalance into account. HOPE uses a hierarchical multitask learning module to alleviate the challenge posed by the long-tailed distribution among classes. To test the performance of HOPE, we compare it with HOPE-one, HOPE-seq, and GraphSAGE on three microbial metagenomic 16S rRNA sequencing datasets, including abalone gut, human gut, and gut of . Experiments demonstrate that HOPE outperforms the baselines on almost all indexes in all experiments. Furthermore, HOPE shows significant generalization ability. HOPE's basic idea is suitable for other related scenarios, such as the prediction of gene function based on gene co-expression networks. The source code of HOPE is freely available at https://github.com/adrift00/HOPE.
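The two inputs named in the abstract, k-mer composition and co-occurrence neighborhoods, can be illustrated with a small sketch. This is not the HOPE implementation (see the repository linked above for that). It is a minimal example that assumes a normalized k-mer frequency vector as the sequence feature and a single GraphSAGE-style mean aggregation over co-occurrence neighbors. The function names `kmer_composition` and `mean_aggregate`, the toy sequences, and the toy edge list are all hypothetical.

```python
from itertools import product


def kmer_composition(seq, k=3):
    """Return the normalized k-mer frequency vector (length 4**k) of a DNA sequence."""
    # Map every possible k-mer over ACGT to a fixed position in the vector.
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = [0.0] * len(index)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows with ambiguous bases (e.g. N) are skipped
        if j is not None:
            counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def mean_aggregate(features, neighbors):
    """One mean-aggregation step: concatenate each node's own features
    with the element-wise mean of its neighbors' features."""
    out = {}
    for node, feat in features.items():
        nbrs = [features[n] for n in neighbors.get(node, []) if n in features]
        if nbrs:
            mean = [sum(col) / len(nbrs) for col in zip(*nbrs)]
        else:
            mean = [0.0] * len(feat)  # isolated node: no neighborhood signal
        out[node] = feat + mean
    return out


# Hypothetical toy data: three OTU sequences and a small co-occurrence graph.
otus = {"otu1": "ACGTACGT", "otu2": "ACGTTTTT", "otu3": "GGGGCCCC"}
edges = {"otu1": ["otu2"], "otu2": ["otu1", "otu3"], "otu3": ["otu2"]}

features = {name: kmer_composition(seq, k=2) for name, seq in otus.items()}
embedded = mean_aggregate(features, edges)
```

In a trained model the concatenated vector would pass through learned weights and a nonlinearity, and several such layers would be stacked. The sketch stops at the untrained aggregation step, so `embedded` only shows how sequence features and network topology end up in one representation.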


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/f114b5700259/fgene-11-608512-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/d4c3c3f0f945/fgene-11-608512-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/d609141b678e/fgene-11-608512-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/67e1b72c4a5f/fgene-11-608512-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/050910a5bcad/fgene-11-608512-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/a1d99555a07a/fgene-11-608512-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/f46d2b2ffc11/fgene-11-608512-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/c5ed086ac6ee/fgene-11-608512-g0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47c6/7874084/90ff27957aa4/fgene-11-608512-g0009.jpg


