School of Information Science and Engineering, Yunnan University, East Outer Ring Road, Chenggong District, Kunming 650500, Yunnan, China.
School of Information and Engineering, Zhongnan University of Economics and Law, 182 South Lake Avenue, East Lake New Technology Development Zone, Wuhan 430073, Hubei, China.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae551.
In recent years, the advent of spatial transcriptomics (ST) technology has unlocked unprecedented opportunities for delving into the complexities of gene expression patterns within intricate biological systems. Despite its transformative potential, the prohibitive cost of ST technology remains a significant barrier to its widespread adoption in large-scale studies. An alternative, more cost-effective strategy involves employing artificial intelligence to predict gene expression levels from readily accessible whole-slide images stained with Hematoxylin and Eosin (H&E). However, existing methods have yet to fully capitalize on the multimodal information provided by H&E images and spatially resolved ST data. In this paper, we propose mclSTExp, a multimodal contrastive learning framework with a Transformer and a DenseNet-121 encoder for Spatial Transcriptomics Expression prediction. We conceptualize each spot as a "word", integrating its intrinsic features with spatial context through the self-attention mechanism of a Transformer encoder. This integration is further enriched by incorporating image features via contrastive learning, thereby enhancing the predictive capability of our model. We conducted an extensive evaluation of highly variable genes in two breast cancer datasets and a skin squamous cell carcinoma dataset, and the results demonstrate that mclSTExp exhibits superior performance in predicting spatial gene expression. Moreover, mclSTExp has shown promise in interpreting cancer-specific overexpressed genes, elucidating immune-related genes, and identifying specialized spatial domains annotated by pathologists. Our source code is available at https://github.com/shizhiceng/mclSTExp.
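The abstract does not specify the exact training objective, but the described pairing of image features with spot embeddings via contrastive learning is commonly implemented as a CLIP-style symmetric InfoNCE loss over matched (image patch, spot) pairs. The sketch below is a minimal NumPy illustration of that idea under this assumption; the function name `symmetric_infonce` and the temperature value are illustrative, not taken from the paper.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def symmetric_infonce(img_emb, spot_emb, temperature=0.07):
    """Symmetric InfoNCE (CLIP-style) loss for a batch of paired embeddings.

    img_emb:  (B, D) image-patch features (e.g. from a DenseNet-121 encoder)
    spot_emb: (B, D) spot features (e.g. from a Transformer encoder)
    Row i of each array is assumed to be a matched pair.
    """
    img = l2_normalize(img_emb)
    spot = l2_normalize(spot_emb)
    logits = img @ spot.T / temperature  # (B, B) cosine-similarity matrix
    labels = np.arange(logits.shape[0])  # matched pairs lie on the diagonal

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerically stable log-softmax
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the image-to-spot and spot-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Minimizing this loss pulls each spot's embedding toward its own image patch and pushes it away from the other patches in the batch, which is the alignment the abstract attributes to the contrastive component.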