Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, PR China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, PR China.
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, PR China.
Methods. 2024 Dec;232:115-128. doi: 10.1016/j.ymeth.2024.11.001. Epub 2024 Nov 13.
Recent advancements in spatial transcriptomics sequencing technologies can not only provide gene expression within individual cells or cell clusters (spots) in a tissue but also pinpoint the exact location of this expression and generate detailed images of stained tissue sections, which offers invaluable insights into cell type identification and cell function exploration. However, effectively integrating the gene expression data, spatial location information, and tissue images from spatial transcriptomics data presents a significant challenge for computational methods in cell classification. In this work, we propose MVCLST, a multi-view comparative learning method to analyze spatial transcriptomics data for accurate cell type classification. MVCLST constructs two views based on gene expression profiles, cell coordinates, and image features. The proposed multi-view method can significantly enhance the effectiveness of feature extraction while avoiding the impact of erroneous information in the tissue image or gene expression data. The model employs four separate encoders to capture shared and unique features within each view. To ensure consistency and facilitate information exchange between the two views, MVCLST incorporates a contrastive learning loss function. The extracted shared and private features from both views are fused using corresponding decoders. Finally, the model utilizes the Leiden algorithm to cluster the learned features for cell type identification. Additionally, we establish a framework called MVCLST-CCFS for spatial transcriptomics data analysis based on MVCLST and consistent clustering. Our method achieves excellent clustering results on the human dorsolateral prefrontal cortex data and the mouse brain tissue data. It also outperforms state-of-the-art techniques in the subsequent search for highly variable genes across cell types on the mouse olfactory bulb data.
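The abstract does not give the form of the contrastive learning loss, so the following is only an illustrative sketch of one common choice, an InfoNCE-style loss, under the assumption that each spot has one embedding per view and that the same spot in the two views forms the positive pair. The function names, the `temperature` parameter, and the toy embeddings are hypothetical and are not taken from MVCLST.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cross_view_contrastive_loss(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss (an assumption, not the paper's stated loss).

    Spot i in view A is pulled toward spot i in view B and pushed away
    from every other spot in view B. Only the A-to-B direction is computed.
    """
    if len(view_a) != len(view_b) or not view_a:
        raise ValueError("both views need the same non-zero number of spots")
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Temperature-scaled cosine similarity to every spot in the other view.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability assigned to the matching spot.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy embeddings for three spots in each view (made-up values).
aligned_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled_b = [aligned_b[1], aligned_b[2], aligned_b[0]]

loss_aligned = cross_view_contrastive_loss(aligned_a, aligned_b)
loss_shuffled = cross_view_contrastive_loss(aligned_a, shuffled_b)
```

In this sketch the loss is lower when the two views agree spot by spot than when the second view is shuffled, which is the consistency property the abstract attributes to the contrastive term. A symmetric variant would average the A-to-B and B-to-A directions.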