Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, 200062, China.
Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore.
Nucleic Acids Res. 2022 Jul 8;50(12):e72. doi: 10.1093/nar/gkac219.
Dimension reduction and (spatial) clustering are usually performed sequentially; however, the low-dimensional embeddings estimated in the dimension-reduction step may not be relevant to the class labels inferred in the clustering step. We therefore developed a computational method, Dimension-Reduction Spatial-Clustering (DR-SC), that performs dimension reduction and (spatial) clustering simultaneously within a unified framework. Joint analysis by DR-SC produces accurate (spatial) clustering results and ensures the effective extraction of biologically informative low-dimensional features. DR-SC is applicable to spatial clustering in spatial transcriptomics, which characterizes the spatial organization of a tissue by segregating it into multiple tissue structures. In this setting, DR-SC relies on a latent hidden Markov random field model to encourage spatial smoothness of the detected cluster boundaries. Underlying DR-SC is an efficient expectation-maximization algorithm based on iterative conditional modes. As such, DR-SC is scalable to large sample sizes and can optimize the spatial smoothness parameter in a data-driven manner. With comprehensive simulations and real data applications, we show that DR-SC outperforms existing clustering and spatial clustering methods: it extracts more biologically relevant features than conventional dimension reduction methods, improves clustering performance, and improves visualization and downstream trajectory inference.
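To make the role of the hidden Markov random field concrete, the following is a minimal sketch of a single iterative-conditional-modes sweep over cluster labels. It is not the DR-SC implementation: it assumes a Potts-type neighbor prior, treats the low-dimensional embeddings and cluster means as fixed, uses an identity covariance, and takes the smoothness parameter `beta` as given rather than estimating it. The names `icm_potts_sweep` and `grid_neighbors` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def grid_neighbors(n_rows, n_cols):
    """Index arrays of the 4-nearest neighbours for spots on a regular grid."""
    nbrs = []
    for r in range(n_rows):
        for c in range(n_cols):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs.append(np.array(
                [rr * n_cols + cc for rr, cc in cand
                 if 0 <= rr < n_rows and 0 <= cc < n_cols], dtype=int))
    return nbrs

def icm_potts_sweep(Z, labels, means, neighbors, beta):
    """One ICM sweep: set each label to the minimiser of its local energy."""
    labels = labels.copy()
    K = means.shape[0]
    for i in range(Z.shape[0]):
        # Data term: Gaussian with identity covariance (assumption of this sketch).
        data_cost = 0.5 * np.sum((Z[i] - means) ** 2, axis=1)
        # Potts term: number of neighbours currently assigned to each cluster.
        counts = np.bincount(labels[neighbors[i]], minlength=K)
        labels[i] = np.argmin(data_cost - beta * counts)
    return labels

# Toy 4x4 tissue: left two columns are cluster 0, right two are cluster 1.
truth = np.array([0 if c < 2 else 1 for r in range(4) for c in range(4)])
means = np.array([[0.0, 0.0], [4.0, 0.0]])
Z = means[truth].copy()
Z[5] = [2.1, 0.0]  # an ambiguous spot, slightly closer to the wrong cluster
nbrs = grid_neighbors(4, 4)

smoothed = icm_potts_sweep(Z, truth, means, nbrs, beta=1.0)
independent = icm_potts_sweep(Z, truth, means, nbrs, beta=0.0)
print("spot 5 with beta=1:", smoothed[5], "| with beta=0:", independent[5])
```

In this toy setting, the ambiguous spot keeps the label of its spatial neighbors when `beta` is positive and follows only its own embedding when `beta` is zero, which is the smoothing behavior the abstract attributes to the random field prior.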