CODARFE: Unlocking the prediction of continuous environmental variables based on microbiome.

Author information

Barbosa Murilo Caminotto, Marques da Silva João Fernando, Alves Leonardo Cardoso, Finn Robert D, Paschoal Alexandre Rossi

Affiliations

Department of Computer Science (DACOM), Universidade Tecnológica Federal do Paraná (UTFPR), Campus Cornélio Procópio, 86300-000, Paraná, Brazil.

Centro de Inovação, Superbac Biotechnology Solutions, 86975-000, Mandaguari, Brazil.

Publication information

Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf055.

Abstract

BACKGROUND

Despite the surge in microbiome data acquisition, few tools can effectively analyze these data and identify correlations between taxonomic composition and continuous environmental factors. Existing tools also do not predict environmental factors in new samples, which underscores the need for solutions that improve our understanding of microbiome dynamics and fill this prediction gap. Here we introduce CODARFE, a novel tool for sparse compositional microbiome predictor selection and prediction of continuous environmental factors.

RESULTS

We tested CODARFE against 4 state-of-the-art tools in 2 experiments. First, in predictor selection, CODARFE outperformed the other tools on 21 of 24 databases in terms of correlation. Second, for human data, CODARFE recovered the highest number of bacteria previously linked to environmental factors among all the tools, at least 7% more. We also tested CODARFE in a cross-study setting, using the same biome under different external effects: a model trained on 1 dataset was used to predict environmental factors in another dataset and achieved a mean absolute percentage error of 11%. Finally, CODARFE is available in 5 formats, ranging from a Windows version with a graphical interface to installable source code for Linux servers and an embedded Jupyter notebook available at MGnify.
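For readers unfamiliar with the metric reported for the cross-study test, the following is a minimal sketch of how a mean absolute percentage error (MAPE) is commonly computed. It is not CODARFE's own code; the function name and the sample values are hypothetical and chosen only for illustration.

```python
def mean_absolute_percentage_error(y_true, y_pred):
    """Return the MAPE as a percentage.

    Assumption: no observed value is zero, since the metric divides
    each error by the corresponding observed value.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")
    # Average of the absolute errors, each relative to its observed value
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical observed vs. predicted values (not data from the study)
observed = [5.0, 6.0, 8.0]
predicted = [5.5, 5.4, 8.8]
mape = mean_absolute_percentage_error(observed, predicted)
print(f"MAPE: {mape:.1f}%")
```

In this made-up example every prediction is off by one tenth of the observed value, so the result is about 10%. A lower MAPE means predictions are closer to the observed values in relative terms.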

CONCLUSIONS

Our findings underscore the robustness and broad applicability of CODARFE across diverse fields, even under varying experimental conditions. Additionally, the ability to predict outcomes in new samples allows for the generation of new insights in previously unexplored contexts, providing researchers with a versatile tool.
