Zhang Wei, Zhang Ziqi, Yang Hailong, Zhang Te, Jiang Shu, Qiao Ning, Deng Zhaohong, Pan Xiaoyong, Shen Hong-Bin, Yu Dong-Jun, Wang Shitong
The School of Artificial Intelligence and Computer Science, Nantong University, Nantong, 226019, China.
The School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, China.
Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf221.
Spatial clustering is a key analytical technique for exploring spatial transcriptomics data. Recent graph neural network-based methods have shown promise in spatial clustering but face notable challenges. First, the functions and complex mechanisms of organisms are difficult to analyze at a single scale, yet most methods rely exclusively on a single-scale representation of transcriptomic data, which may limit the discriminative power of the extracted features for spatial domain clustering. Second, classical clustering algorithms are often applied directly to the latent representations, which leaves room for a tailored clustering method that further improves the accuracy of spatial domain annotation.
To address these limitations, we propose m2ST, a novel dual multi-scale graph clustering method. m2ST first uses a multi-scale masked graph autoencoder to extract representations at different scales from spatial transcriptomic data. To compress and distill the meaningful knowledge embedded in the data, m2ST applies a random masking mechanism to node features and uses a scaled cosine error as the loss function. In addition, we introduce a tailored multi-scale clustering framework that integrates the exploration of scale-common and scale-specific information into the clustering process, achieving more robust annotation performance. Finally, Shannon entropy is used to dynamically adjust the importance of the different scales. Extensive experiments on multiple spatial transcriptomic datasets demonstrate the superior performance of m2ST compared to existing methods.
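The abstract names three ingredients without giving formulas: random masking of node features, a scaled cosine error, and Shannon-entropy-based scale weighting. The NumPy sketch below illustrates one plausible reading of each and is not the m2ST implementation. The scaled cosine error is assumed to take the form mean((1 - cos)^gamma) that is common in masked graph autoencoders. The entropy weighting rule (a softmax over negative mean entropy of soft cluster assignments) is purely hypothetical, as are all function names, the mask rate, and gamma.

```python
import numpy as np


def mask_node_features(x, mask_rate, rng):
    """Zero the feature rows of a random subset of nodes (assumed masking scheme)."""
    n = x.shape[0]
    n_mask = max(1, int(round(mask_rate * n)))
    masked_idx = rng.choice(n, size=n_mask, replace=False)
    x_masked = x.copy()
    x_masked[masked_idx] = 0.0
    return x_masked, masked_idx


def scaled_cosine_error(x_true, x_recon, gamma=2.0, eps=1e-12):
    """Mean over rows of (1 - cosine similarity) ** gamma (assumed loss form)."""
    num = np.sum(x_true * x_recon, axis=1)
    den = np.linalg.norm(x_true, axis=1) * np.linalg.norm(x_recon, axis=1) + eps
    cos = num / den
    return float(np.mean((1.0 - cos) ** gamma))


def entropy_scale_weights(soft_assignments, eps=1e-12):
    """Hypothetical rule: scales whose soft cluster assignments have lower
    mean Shannon entropy (more confident) receive a larger weight."""
    mean_entropy = np.array([
        np.mean(-np.sum(q * np.log(q + eps), axis=1)) for q in soft_assignments
    ])
    w = np.exp(-mean_entropy)
    return w / w.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(10, 4))  # toy expression matrix: 10 spots, 4 genes
    x_masked, idx = mask_node_features(x, mask_rate=0.3, rng=rng)
    # In a real model the loss would compare the decoder output with the
    # original features on the masked nodes; the masked input stands in here.
    print("masked nodes:", sorted(idx.tolist()))
    print("loss for a perfect reconstruction:", scaled_cosine_error(x[idx], x[idx]))

    confident = np.array([[0.98, 0.01, 0.01], [0.01, 0.98, 0.01]])
    uncertain = np.full((2, 3), 1.0 / 3.0)
    print("scale weights:", entropy_scale_weights([confident, uncertain]))
```

In this sketch, raising gamma above 1 shrinks the contribution of rows that are already well reconstructed, which is the usual motivation for scaling the cosine error. How m2ST actually combines entropy with the per-scale clustering objective is not stated in the abstract.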