Machine learning-assisted identification of bioindicators predicts medium-chain carboxylate production performance of an anaerobic mixed culture.

Affiliations

Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany.

Institute for Bioengineering and Biosciences, Department of Bioengineering, Instituto Superior Técnico Universidade de Lisboa, Lisbon, Portugal.

Publication information

Microbiome. 2022 Mar 25;10(1):48. doi: 10.1186/s40168-021-01219-2.

DOI:10.1186/s40168-021-01219-2
PMID:35331330
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8952268/
Abstract

BACKGROUND

The ability to quantitatively predict ecophysiological functions of microbial communities is an important step toward engineering microbiota for desired functions related to specific biochemical conversions. Here, we present the quantitative prediction of medium-chain carboxylate production in two continuous anaerobic bioreactors from 16S rRNA gene dynamics in enriched communities.

RESULTS

By progressively shortening the hydraulic retention time (HRT) from 8 to 2 days, using different temporal schemes in two bioreactors operated for 211 days, we achieved higher productivities and yields of the target products n-caproate and n-caprylate. The datasets generated from each bioreactor were used independently to train and test machine learning algorithms that predict n-caproate and n-caprylate productivities from 16S rRNA gene data. Our dataset consisted of 14 and 40 samples from HRTs of 8 and 2 days, respectively. Because the dataset was small and imbalanced, we compared linear regression, support vector machine and random forest regression algorithms on both the original dataset and a balanced dataset generated by synthetic minority oversampling. We also performed cross-validation to estimate model stability. Random forest regression performed best, producing more consistent results with median error rates below 8%. Prediction accuracy for n-caproate and n-caprylate productivities exceeded 90%. Four inferred bioindicators, belonging to the genera Olsenella, Lactobacillus, Syntrophococcus and Clostridium IV, appear relevant to the higher carboxylate productivity at shorter HRT. Metagenome-assembled genomes recovered for these bioindicators confirmed their genetic potential to perform key steps of medium-chain carboxylate production.
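The balancing and validation steps described in the results can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' pipeline. The toy data, the function names (`smote_like`, `knn_predict`, `cv_median_error`), the choice to interpolate the target together with the features, and the k-nearest-neighbour stand-in regressor (used in place of random forest to avoid external dependencies) are all assumptions.

```python
import math
import random
from statistics import median

# Each row is (feature_1, ..., feature_n, target); the last entry is the
# productivity to be predicted. All values below are made-up toy data.


def smote_like(samples, n_new, k=3, rng=None):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (features and target alike)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        others = [s for s in samples if s is not base]
        others.sort(key=lambda s: math.dist(s[:-1], base[:-1]))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def knn_predict(train, x, k=3):
    """Stand-in regressor: mean target of the k nearest training rows."""
    nearest = sorted(train, key=lambda r: math.dist(r[:-1], x))[:k]
    return sum(r[-1] for r in nearest) / len(nearest)


def cv_median_error(rows, n_folds=5, rng=None):
    """Return the median relative error (%) of each fold in k-fold CV."""
    rng = rng or random.Random(0)
    rows = rows[:]
    rng.shuffle(rows)
    fold_errors = []
    for i in range(n_folds):
        test = rows[i::n_folds]
        train = [r for j, r in enumerate(rows) if j % n_folds != i]
        errors = [abs(knn_predict(train, r[:-1]) - r[-1]) / abs(r[-1]) * 100
                  for r in test]
        fold_errors.append(median(errors))
    return fold_errors


rng = random.Random(42)


def toy_rows(n, centre):
    """Hypothetical rows: two relative abundances and a productivity."""
    rows = []
    for _ in range(n):
        a = centre + rng.uniform(-0.05, 0.05)
        b = 1.0 - a
        y = 10.0 * a + 1.0 + rng.uniform(-0.1, 0.1)
        rows.append((a, b, y))
    return rows


hrt8 = toy_rows(14, 0.2)   # minority group (14 samples, as in the abstract)
hrt2 = toy_rows(40, 0.6)   # majority group (40 samples)

synthetic = smote_like(hrt8, len(hrt2) - len(hrt8), rng=rng)
balanced = hrt8 + synthetic + hrt2          # 40 rows per group
fold_errors = cv_median_error(hrt8 + hrt2)  # stability across folds
print(len(balanced), [round(e, 2) for e in fold_errors])
```

In a real analysis the oversampling would normally be confined to the training folds so that synthetic rows never leak into the test data; the abstract does not say how the authors handled this.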

CONCLUSIONS

Shortening the hydraulic retention time of continuous bioreactor systems makes it possible to shape communities with the desired chain elongation functions. Using machine learning, we demonstrated that 16S rRNA amplicon sequencing data can be used to predict bioreactor process performance quantitatively and accurately. Characterizing and harnessing bioindicators holds promise for managing reactor microbiota to select for target processes. Our mathematical framework is transferable to other ecosystem processes and microbial systems in which community dynamics are linked to key functions. The general methodology used here can be adapted to data types of other functional categories, such as genes, transcripts, proteins or metabolites.
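To see why the HRT can act as a selection pressure, it helps to write it out. The relation below is the standard definition for a continuously fed reactor. The symbols V (working volume), Q (volumetric flow rate) and D (dilution rate) are introduced here for illustration and do not appear in the abstract.

```latex
\begin{aligned}
\mathrm{HRT} &= \frac{V}{Q}, \qquad D = \frac{1}{\mathrm{HRT}} = \frac{Q}{V} \\
\mathrm{HRT} = 8\ \mathrm{d} &\;\Rightarrow\; D = 0.125\ \mathrm{d^{-1}} \\
\mathrm{HRT} = 2\ \mathrm{d} &\;\Rightarrow\; D = 0.5\ \mathrm{d^{-1}}
\end{aligned}
```

At constant working volume, going from 8 to 2 days quadruples the flow. In an ideally mixed reactor without biomass retention (an assumption; the abstract does not describe the reactor configuration), organisms whose net growth rate stays below D are washed out, which is one way a shorter HRT can shape the community.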

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/90cbfe65f62f/40168_2021_1219_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/00aebac97efc/40168_2021_1219_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/079bc12356e7/40168_2021_1219_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/13e2327e1128/40168_2021_1219_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/ff0508ec406d/40168_2021_1219_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/b4831de45e25/40168_2021_1219_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/6cd39bd259c5/40168_2021_1219_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8952268/27d4d9329f3d/40168_2021_1219_Fig8_HTML.jpg
