

SCNIC: Sparse correlation network investigation for compositional data.

Author information

Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.

Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK.

Publication information

Mol Ecol Resour. 2023 Jan;23(1):312-325. doi: 10.1111/1755-0998.13704. Epub 2022 Sep 1.

Abstract

Microbiome studies are often limited by a lack of statistical power due to small sample sizes and a large number of features. This problem is exacerbated in correlative studies of multi-omic datasets. Statistical power can be increased by finding and summarizing modules of correlated observations, which is a form of dimensionality reduction. Additionally, modules provide biological insight, as correlated groups of microbes can have relationships among themselves. To address these challenges, we developed SCNIC: Sparse Cooccurrence Network Investigation for compositional data. SCNIC is open-source software that generates correlation networks and detects and summarizes modules of highly correlated features. Modules can be formed using either the Louvain Modularity Maximization (LMM) algorithm or a Shared Minimum Distance (SMD) algorithm that we newly describe here and relate to LMM using simulated data. We applied SCNIC to two published datasets, achieving increased statistical power and identifying microbes that not only differed across groups but also correlated strongly with each other, suggesting shared environmental drivers or cooperative relationships among them. SCNIC provides an easy way to generate correlation networks, identify modules of correlated features and summarize them for downstream statistical analysis. Although SCNIC was designed with properties of microbiome data in mind, such as compositionality and sparsity, it can be applied to a variety of data types, including metabolomics data, and used to integrate multiple data types. SCNIC allows for the identification of functional microbial relationships at scale while increasing statistical power through feature reduction.
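The workflow the abstract describes (build a correlation network, detect modules of highly correlated features, then summarize each module into a single feature) can be illustrated with a small sketch. This is not SCNIC's code or API. It assumes Spearman correlation as the correlation measure, which does not correct for compositionality, and it uses connected components of the thresholded network as a simple stand-in for the LMM and SMD module-detection algorithms. The function names, the `min_r` threshold, the toy table and the choice to sum member counts are all hypothetical and chosen for illustration only.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def find_modules(table, min_r=0.8):
    """Stand-in module detection: connected components of the network
    whose edges are feature pairs with correlation >= min_r."""
    parent = {name: name for name in table}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(table, 2):
        if spearman(table[a], table[b]) >= min_r:
            parent[find(a)] = find(b)

    groups = {}
    for name in table:
        groups.setdefault(find(name), []).append(name)
    # Only groups with at least two features count as modules.
    return [sorted(g) for g in groups.values() if len(g) > 1]


def summarize(table, modules):
    """Collapse each module into one feature by summing member counts
    per sample; features outside any module are kept unchanged."""
    in_module = {name for m in modules for name in m}
    collapsed = {n: list(v) for n, v in table.items() if n not in in_module}
    for idx, members in enumerate(modules):
        collapsed[f"module_{idx}"] = [
            sum(col) for col in zip(*(table[m] for m in members))
        ]
    return collapsed


# Hypothetical toy table: feature name -> counts across five samples.
table = {
    "taxon_A": [10, 20, 30, 40, 50],
    "taxon_B": [12, 18, 33, 41, 55],
    "taxon_C": [50, 40, 30, 20, 10],
    "taxon_D": [5, 3, 9, 1, 7],
}
modules = find_modules(table, min_r=0.8)
collapsed = summarize(table, modules)
```

In this toy table, `taxon_A` and `taxon_B` rise together across samples and collapse into a single summed feature, so the table shrinks from four features to three. That reduction in the number of features to test is the route by which the abstract says module summarization can increase statistical power.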


