Cross-validation for training and testing co-occurrence network inference algorithms.

Authors

Agyapong Daniel, Propster Jeffrey Ryan, Marks Jane, Hocking Toby Dylan

Affiliations

School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ, USA.

Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA.

Publication information

BMC Bioinformatics. 2025 Mar 6;26(1):74. doi: 10.1186/s12859-025-06083-7.

Abstract

BACKGROUND

Microorganisms are found in almost every environment, including soil, water, air, and inside other organisms such as animals and plants. While some microorganisms cause disease, most contribute to biological processes such as decomposition, fermentation, and nutrient cycling. Much research has examined microbial communities in various environments and how their interactions and relationships can provide insight into various diseases. These communities form intricate ecological networks that are fundamental to ecosystem functioning and host health, so understanding the networks is crucial for developing targeted interventions in both environmental and clinical settings. High-throughput sequencing technologies have generated unprecedented amounts of microbiome data, which calls for robust computational methods for network inference and validation. Co-occurrence network inference algorithms help us understand the complex associations among microorganisms, especially bacteria. Existing network inference algorithms employ techniques such as correlation, regularized linear regression, and conditional dependence, each with its own hyper-parameters that determine the sparsity of the network.
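To illustrate how a hyper-parameter can control network sparsity, here is a minimal sketch of a correlation-based co-occurrence network. It is not taken from the paper: the function names, the use of a simple absolute Pearson correlation cutoff, and the toy abundance values are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_network(abundance, threshold):
    """Return edges (a, b) whose absolute correlation is at least `threshold`.

    `abundance` maps a taxon name to its values across samples.
    `threshold` is the (assumed) sparsity hyper-parameter.
    """
    edges = set()
    for a, b in combinations(sorted(abundance), 2):
        if abs(pearson(abundance[a], abundance[b])) >= threshold:
            edges.add((a, b))
    return edges


# Made-up, already-transformed abundances for four hypothetical taxa, six samples.
toy = {
    "taxon_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "taxon_B": [1.1, 2.3, 2.9, 4.2, 4.8, 6.1],  # closely tracks taxon_A
    "taxon_C": [6.0, 4.5, 4.9, 2.0, 2.8, 1.0],  # roughly opposes taxon_A
    "taxon_D": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],  # weakly related to the rest
}

# A larger threshold can only remove edges, so the network gets sparser.
for threshold in (0.2, 0.6, 0.95):
    edges = correlation_network(toy, threshold)
    print(threshold, len(edges), sorted(edges))
```

Because raising the cutoff can only drop edges, the networks are nested. Choosing the cutoff is therefore a model selection problem, not a detail that can be fixed in advance.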

RESULTS

Previous methods for evaluating the quality of an inferred network rely on external data or on network consistency across sub-samples, both of which have several drawbacks that limit their applicability to real microbiome composition data sets. We propose a novel cross-validation method to evaluate co-occurrence network inference algorithms, together with new methods for applying existing algorithms to make predictions on test data. Our method demonstrates superior performance in handling compositional data and in addressing the high dimensionality and sparsity inherent in real microbiome data sets. The proposed framework also provides robust estimates of network stability.
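The general idea of cross-validating a network inference algorithm can be sketched as follows. The abstract does not say how predictions on test data are made, so everything below is an assumption for illustration and not the authors' procedure: the network is a thresholded correlation network, a held-out taxon is predicted by averaging univariate least-squares predictions from its neighbours, and the data are synthetic.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 0.0 if var_x == 0 or var_y == 0 else cov / sqrt(var_x * var_y)


def fit_network(train, threshold):
    """Infer each taxon's neighbours from training samples (rows = samples)."""
    n_taxa = len(train[0])
    cols = [[row[j] for row in train] for j in range(n_taxa)]
    neighbours = {j: [] for j in range(n_taxa)}
    for i, j in combinations(range(n_taxa), 2):
        if abs(pearson(cols[i], cols[j])) >= threshold:
            neighbours[i].append(j)
            neighbours[j].append(i)
    return cols, neighbours


def predict(cols, neighbours, row, target):
    """Assumed rule: average univariate least-squares predictions from neighbours;
    fall back to the training mean when the taxon has no usable neighbour."""
    y = cols[target]
    mean_y = sum(y) / len(y)
    preds = []
    for k in neighbours[target]:
        x = cols[k]
        mean_x = sum(x) / len(x)
        var_x = sum((a - mean_x) ** 2 for a in x)
        if var_x == 0:
            continue  # a constant neighbour carries no information
        slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
        preds.append(mean_y + slope * (row[k] - mean_x))
    return sum(preds) / len(preds) if preds else mean_y


def cv_error(data, threshold, n_folds=3, seed=0):
    """Mean squared error on held-out samples, pooled over all folds and taxa."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    total, count = 0.0, 0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [data[i] for i in idx if i not in held_out]
        cols, neighbours = fit_network(train, threshold)
        for i in test_idx:
            for target in range(len(data[i])):
                err = data[i][target] - predict(cols, neighbours, data[i], target)
                total += err ** 2
                count += 1
    return total / count


# Synthetic data: taxa 0 and 1 co-vary strongly, taxa 2 and 3 are independent noise.
rng = random.Random(1)
data = []
for _ in range(30):
    base = rng.gauss(0, 1)
    data.append([base, base + rng.gauss(0, 0.1), rng.gauss(0, 1), rng.gauss(0, 1)])

grid = [0.1, 0.5, 0.9, 1.01]  # 1.01 yields an empty network (mean-only baseline)
errors = {t: cv_error(data, t) for t in grid}
best = min(errors, key=errors.get)
print(errors, "selected threshold:", best)
```

In this sketch the same held-out error serves both purposes named in the abstract: minimising it over the grid selects the hyper-parameter (training), and comparing the minimised error between algorithms would rank them (testing).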

CONCLUSIONS

Our empirical study shows that the proposed cross-validation method is useful both for hyper-parameter selection (training) and for comparing the quality of inferred networks across different algorithms (testing). This advancement represents a significant step forward in microbiome network analysis, providing researchers with a reliable tool for understanding complex microbial interactions. The method also applies beyond microbiome studies, to other fields where network inference from high-dimensional compositional data is crucial, such as gene regulatory networks and ecological food webs. Our framework establishes a new standard for validation in network inference, potentially accelerating discoveries in microbial ecology and human health.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea3f/11883995/2b4864528abe/12859_2025_6083_Fig1_HTML.jpg
