Recent advances in deep learning and language models for studying the microbiome.

Authors

Yan Binghao, Nam Yunbi, Li Lingyao, Deek Rebecca A, Li Hongzhe, Ma Siyuan

Affiliations

Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States.

Publication information

Front Genet. 2025 Jan 7;15:1494474. doi: 10.3389/fgene.2024.1494474. eCollection 2024.

DOI:10.3389/fgene.2024.1494474
PMID:39840283
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11747409/
Abstract

Recent advancements in deep learning, particularly large language models (LLMs), have made a significant impact on how researchers study microbiome and metagenomics data. Microbial protein and genomic sequences, like natural languages, form a language of life, enabling the adoption of LLMs to extract useful insights from complex microbial ecologies. In this paper, we review applications of deep learning and language models in analyzing microbiome and metagenomics data. We focus on problem formulations, necessary datasets, and the integration of language modeling techniques. We provide an extensive overview of protein/genomic language modeling and their contributions to microbiome studies. We also discuss applications such as novel viromics language modeling, biosynthetic gene cluster prediction, and knowledge integration for metagenomics studies.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7f14/11747409/69d1d9e8e283/fgene-15-1494474-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7f14/11747409/1b421cf1a9e3/fgene-15-1494474-g001.jpg
