

A comparative analysis of mutual information methods for pairwise relationship detection in metagenomic data.

Affiliation

Quantitative and Computational Biology Department, University of Southern California, Los Angeles, CA, 90089, USA.

Publication information

BMC Bioinformatics. 2024 Aug 14;25(1):266. doi: 10.1186/s12859-024-05883-7.

Abstract

BACKGROUND

Co-occurrence networks built from metagenomic data often rely on correlation to infer pairwise relationships between microbes. However, biological systems are complex and often behave non-linearly. Relying on correlation alone may therefore overlook important relationships and fail to capture the full intricacy of the underlying interaction networks. It is of interest to incorporate metrics that robustly detect not only linear relationships but non-linear ones as well.

RESULTS

In this paper, we explore various mutual information (MI) estimation approaches for quantifying pairwise relationships in biological data and compare their performance against two traditional measures: Pearson's correlation coefficient, r, and Spearman's rank correlation coefficient, ρ. The metrics are tested both on simulated data designed to mimic pairwise relationships that may be found in ecological systems and on real data from a previous study on C. diff infection. The results demonstrate that, for asymmetric relationships, mutual information estimators can detect associations better than Pearson's or Spearman's correlation coefficients. Specifically, we find that these estimators perform better at detecting exploitative relationships, demonstrating the potential benefit of including them in future metagenomic studies.
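To make the comparison concrete, the following is a minimal sketch in plain Python of how the three kinds of measures can disagree on a non-linear relationship. Everything in it is an assumption for illustration rather than the paper's method: the U-shaped toy relationship, the sample size, the helper names (`pearson`, `ranks`, `spearman`, `discretize`, `binned_mi`), and the equal-width binned plug-in MI estimator, which may differ from the estimators the authors evaluated.

```python
import math
import random
from collections import Counter


def pearson(x, y):
    """Pearson's r computed from centered sums."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def binned_mi(x, y, bins=8):
    """Plug-in mutual information estimate in nats (hypothetical choice of estimator)."""
    n = len(x)
    bx, by = discretize(x, bins), discretize(y, bins)
    joint = Counter(zip(bx, by))
    count_x, count_y = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (count_x[i] * count_y[j]))
        for (i, j), c in joint.items()
    )


# Toy data (assumption): y depends strongly on x, but not monotonically.
rng = random.Random(0)
x = [rng.random() for _ in range(2000)]
y = [(v - 0.5) ** 2 + rng.gauss(0, 0.01) for v in x]

# Shuffling y breaks the dependence and gives a rough null reference for MI.
y_shuffled = y[:]
rng.shuffle(y_shuffled)

r = pearson(x, y)
rho = spearman(x, y)
mi = binned_mi(x, y)
mi_null = binned_mi(x, y_shuffled)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
print(f"MI = {mi:.3f} nats, MI after shuffling = {mi_null:.3f} nats")
```

In this toy setting both correlation coefficients should sit near zero while the MI estimate stays well above its shuffled reference. The binned estimator is sensitive to the number of bins and is biased upward in small samples, so a shuffled or permutation baseline like the one above is a reasonable sanity check before interpreting the value.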

CONCLUSIONS

Mutual information (MI) can uncover complex pairwise relationships in biological data that may be missed by traditional measures of association. The inclusion of such relationships when constructing co-occurrence networks can result in a more comprehensive analysis than the use of correlation alone.
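For reference, the standard textbook definition of mutual information for two discrete random variables helps explain why it can register relationships that correlation misses. The notation below is generic and not taken from the paper, whose estimators may work with continuous or differently discretized data.

```latex
I(X;Y) \;=\; \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}
       \;=\; H(X) + H(Y) - H(X,Y),
\qquad I(X;Y) \ge 0 .
```

Here $I(X;Y) = 0$ holds if and only if $X$ and $Y$ are independent, whereas a correlation coefficient of zero does not imply independence. This is why a dependence that is neither linear nor monotonic can still yield a clearly positive MI.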


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c13e/11323399/8104bdd687ff/12859_2024_5883_Fig1_HTML.jpg
