Zhang Yingxi, Yu Zhuohan, Wong Ka-Chun, Li Xiangtao
School of Artificial Intelligence, Jilin University, Changchun, 130012, China.
Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, Hong Kong SAR.
Bioinformatics. 2024 Jul 16;40(7). doi: 10.1093/bioinformatics/btae451.
Spatial transcriptomics can quantify gene expression and its spatial distribution in tissues, thus revealing molecular mechanisms of cellular interactions underlying tissue heterogeneity, tissue regeneration, and spatially localized disease mechanisms. However, existing spatial clustering methods often fail to exploit the full potential of spatial information, resulting in inaccurate identification of spatial domains.
We present stDGCC, a deep graph contrastive clustering framework that uncovers underlying spatial domains by explicitly modeling both the spatial information and the gene expression profiles in spatial transcriptomics data. stDGCC uses a spatially informed graph node embedding model to preserve the topological information of spots and learns an informative, discriminative representation of the data through self-supervised contrastive learning. By simultaneously optimizing a contrastive learning loss, a reconstruction loss, and a Kullback-Leibler (KL) divergence loss, stDGCC jointly optimizes feature learning and topology preservation in an end-to-end manner. We validate stDGCC on spatial transcriptomics datasets acquired from different platforms with varying spatial resolutions. Our experiments show that stDGCC outperforms various state-of-the-art clustering methods in accurately identifying cellular-level biological structures.
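As a rough illustration of how three such losses can be combined into one objective, the sketch below computes a weighted sum of a contrastive term, a reconstruction term, and a KL term on toy data. It is not taken from the stDGCC code: the InfoNCE form of the contrastive loss, the mean squared error reconstruction, the Student's t soft assignment with a sharpened target distribution for the KL term, the weights `w_con`, `w_rec`, `w_kl`, and all function names are assumptions chosen for illustration.

```python
import math
import random

def soft_assign(z, centers):
    """Student's t soft assignment Q of each embedding to each cluster center (assumed form)."""
    q = []
    for zi in z:
        sims = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, c))) for c in centers]
        total = sum(sims)
        q.append([s / total for s in sims])
    return q

def target_distribution(q):
    """Sharpened target P: square Q, divide by cluster frequency, renormalize each row."""
    freq = [sum(row[k] for row in q) for k in range(len(q[0]))]
    p = []
    for row in q:
        w = [row[k] ** 2 / freq[k] for k in range(len(row))]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_loss(p, q):
    """Mean over spots of KL(P_i || Q_i)."""
    return sum(pk * math.log(pk / qk)
               for pr, qr in zip(p, q) for pk, qk in zip(pr, qr)) / len(p)

def reconstruction_loss(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    n = sum(len(row) for row in x)
    return sum((a - b) ** 2 for xr, hr in zip(x, x_hat) for a, b in zip(xr, hr)) / n

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE: spot i in view 1 is the positive for spot i in view 2; other spots are negatives."""
    loss = 0.0
    for i, zi in enumerate(z1):
        logits = [cosine(zi, zj) / temperature for zj in z2]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(z1)

def joint_loss(x, x_hat, z1, z2, centers, w_con=1.0, w_rec=1.0, w_kl=1.0):
    """Weighted sum of the three terms; the weights are hypothetical hyperparameters."""
    q = soft_assign(z1, centers)
    p = target_distribution(q)
    return (w_con * info_nce(z1, z2)
            + w_rec * reconstruction_loss(x, x_hat)
            + w_kl * kl_loss(p, q))

# Toy data: 20 spots, 8 genes, 4-dimensional embeddings, 3 clusters.
random.seed(0)
x = [[random.gauss(0, 1) for _ in range(8)] for _ in range(20)]
x_hat = [[v + random.gauss(0, 0.1) for v in row] for row in x]
z1 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(20)]
z2 = [[v + random.gauss(0, 0.05) for v in row] for row in z1]  # perturbed second view
centers = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
total = joint_loss(x, x_hat, z1, z2, centers)
```

In an actual training loop the embeddings, reconstruction, and cluster centers would come from a graph neural network and be updated by gradient descent; here they are random stand-ins, so only the way the three terms are combined is meaningful.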
Code and data are available from https://github.com/TimE9527/stDGCC and https://figshare.com/projects/stDGCC/186525.
Supplementary data are available at Bioinformatics online.