Department of Statistics and Probability, Michigan State University, East Lansing, Michigan, United States of America.
Department of Mathematics, University of Utah, Salt Lake City, Utah, United States of America.
PLoS Comput Biol. 2024 May 29;20(5):e1012014. doi: 10.1371/journal.pcbi.1012014. eCollection 2024 May.
Recent advances in single-cell technologies have enabled high-resolution characterization of tissue and cancer compositions. Although numerous tools for dimension reduction and clustering are available for single-cell data analysis, these methods often fail to simultaneously preserve local cluster structure and global data geometry. To address this challenge, we developed a novel analysis framework, Single-Cell Path Metrics Profiling (scPMP), using power-weighted path metrics, which measure distances between cells in a data-driven way. Unlike Euclidean distance and other commonly used distance metrics, path metrics are density sensitive and respect the underlying data geometry. By combining path metrics with multidimensional scaling, a low-dimensional embedding of the data is obtained which preserves both the global data geometry and cluster structure. We evaluate the method for both clustering quality and geometric fidelity, and it outperforms current scRNAseq clustering algorithms on a wide range of benchmarking data sets.
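The abstract does not state the formula, so the following is only a minimal sketch of how a power-weighted path metric is commonly defined: the distance between two points is the minimum, over all paths through the data, of the sum of Euclidean edge lengths raised to a power p, with the p-th root taken at the end. The function name `path_metric`, the choice p = 2, the toy points, and the use of Floyd-Warshall on the complete graph are all illustrative assumptions, not the scPMP implementation; the multidimensional scaling step is omitted.

```python
import math

def path_metric(points, p=2.0):
    """All-pairs power-weighted path distances over `points` (hypothetical helper)."""
    n = len(points)
    # Edge cost: Euclidean distance raised to the power p.
    cost = [[math.dist(a, b) ** p for b in points] for a in points]
    # Floyd-Warshall: cheapest sum of powered edge lengths over any path.
    for k in range(n):
        ck = cost[k]
        for i in range(n):
            cik = cost[i][k]
            row = cost[i]
            for j in range(n):
                if cik + ck[j] < row[j]:
                    row[j] = cik + ck[j]
    # Take the p-th root so the result scales like a distance.
    return [[c ** (1.0 / p) for c in row] for row in cost]

# Toy data (assumed): a dense chain of four points plus one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (3.0, 3.0)]
d = path_metric(points, p=2.0)
```

On this toy data the two ends of the chain are at path distance sqrt(3), compared with a Euclidean distance of 3, because the path can take many short hops through a densely sampled region. The isolated point stays at distance 3 from its nearest neighbor, since no shortcut exists. This is the sense in which such a metric is density sensitive; with p = 1 it reduces to the ordinary Euclidean distance.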