Multidimensional scaling improves distance-based clustering for microbiome data.

Authors

Chen Guanhua, Wang Xinyue, Sun Qiang, Tang Zheng-Zheng

Affiliations

Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States.

Department of Statistics, Pennsylvania State University, University Park, PA 16802, United States.

Publication

Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf042.

Abstract

MOTIVATION

Clustering patients into subgroups based on their microbial compositions can greatly enhance our understanding of the role of microbes in human health and disease etiology. Distance-based clustering methods, such as partitioning around medoids (PAM), are popular due to their computational efficiency and absence of distributional assumptions. However, the performance of these methods can be suboptimal when true cluster memberships are driven by differences in the abundance of only a few microbes, a situation known as the sparse signal scenario.

RESULTS

We demonstrate that classical multidimensional scaling (MDS), a widely used dimensionality reduction technique, effectively denoises microbiome data and enhances the clustering performance of distance-based methods. We propose a two-step procedure that first applies MDS to project high-dimensional microbiome data into a low-dimensional space, followed by distance-based clustering using the low-dimensional data. Our extensive simulations demonstrate that our procedure offers superior performance compared to directly conducting distance-based clustering under the sparse signal scenario. The advantage of our procedure is further showcased in several real data applications.
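The two-step idea (classical MDS first, then distance-based clustering on the low-dimensional coordinates) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state which distance is used, so Bray-Curtis dissimilarity is chosen here; the embedding dimension of 2 is arbitrary; and the clustering step is a simplified k-medoids routine standing in for PAM. The function names are hypothetical and are not the API of the MDSMClust package.

```python
import numpy as np

def bray_curtis(X):
    """Pairwise Bray-Curtis dissimilarity between rows of a nonnegative matrix.
    (Assumed distance; the abstract does not name one.)"""
    diff = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    total = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    return diff / total

def classical_mds(D, n_dims):
    """Classical (Torgerson) MDS: embed a distance matrix in n_dims coordinates."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]   # keep the largest n_dims
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against negative values
    return eigvecs[:, order] * np.sqrt(top_vals)

def euclidean(Y):
    """Pairwise Euclidean distances between rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def k_medoids(D, k, n_iter=100):
    """Simplified k-medoids (a stand-in for PAM, not the full build/swap search)."""
    # Deterministic start: the most central point, then farthest-first picks.
    medoids = [int(np.argmin(D.sum(axis=0)))]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)  # assign to nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance
                # to the other members of its cluster.
                within = D[np.ix_(members, members)].sum(axis=0)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return labels, medoids

def mds_then_cluster(X, k, n_dims=2):
    """Step 1: MDS on the sample distance matrix. Step 2: cluster the embedding."""
    D = bray_curtis(X)
    Y = classical_mds(D, n_dims)
    return k_medoids(euclidean(Y), k)
```

Calling `mds_then_cluster(X, k=2)` on a samples-by-taxa abundance matrix `X` returns one cluster label per sample together with the indices of the medoid samples. The direct approach that the abstract compares against would correspond to `k_medoids(bray_curtis(X), k)`, which skips the embedding step.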

AVAILABILITY AND IMPLEMENTATION

The R package MDSMClust is available at https://github.com/wxy929/MDS-project.
