Wang Yuequn, Wang Jun, Xu Yanyu, Liu Ning, Liu Bin, Li Yuliang, Yu Guoxian
School of Software, Shandong University, Jinan 250101, Shandong, China.
SDU-NTU Centre for Artificial Intelligence Research, Shandong University, Jinan 250101, Shandong, China.
Nucleic Acids Res. 2025 Sep 5;53(17). doi: 10.1093/nar/gkaf865.
Spatial transcriptomics (ST) reveals gene expression distributions within tissues. Yet predicting spatial gene expression from histological images remains challenging: ST data are limited and carry little prior knowledge, and inter-slice heterogeneity and intra-slice complexity are insufficiently captured. To tackle these challenges, we introduce FmH2ST, a foundation model-based method for spatial gene expression prediction. Equipped with powerful foundation models pretrained on massive image collections, FmH2ST employs a dual-branch framework to integrate prior knowledge from the foundation model with fine-grained details from spot images. The foundation model branch employs a multilevel feature extraction strategy to obtain features enriched with slice context for capturing inter-slice heterogeneity, and a dual-graph strategy to obtain spatially and semantically enriched features for modeling intra-slice complexity. The spot-specific learning branch leverages multiscale convolutions, a Transformer, and a graph neural network to extract fine-grained spot features. The outputs of the two branches are adaptively fused for better prediction under a collaborative branch learning strategy. Experimental results show that FmH2ST outperforms state-of-the-art methods on benchmark datasets. FmH2ST can denoise raw gene expression, reveal cancer spatial heterogeneity and gene co-expression patterns, and support the inference of gene regulatory networks. Overall, FmH2ST is effective for predicting ST, with potential applications in clinical diagnostics and personalized treatment.
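The dual-graph strategy and the adaptive fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the graphs are built or how the branches are fused. The sketch assumes that the spatial graph is a k-nearest-neighbour graph over spot coordinates, that the semantic graph is a k-nearest-neighbour graph over spot features under cosine distance, and that fusion is a single sigmoid gate. All function names (`knn_graph`, `cosine_distance`, `gated_fusion`) and the toy data are hypothetical.

```python
import math

def knn_graph(points, k, dist):
    """For each point i, return the sorted indices of its k nearest neighbours under dist."""
    neighbours = []
    for i, p in enumerate(points):
        ranked = sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append(sorted(j for _, j in ranked[:k]))
    return neighbours

def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def gated_fusion(feat_a, feat_b, gate_logit):
    """Blend two feature vectors with a sigmoid gate (assumed fusion rule, not from the paper)."""
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    return [g * x + (1.0 - g) * y for x, y in zip(feat_a, feat_b)]

# Toy slice: four spots with 2-D coordinates and 2-D feature vectors (made-up values).
coords = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]

# Spatial graph: neighbours are physically close spots.
spatial_graph = knn_graph(coords, k=1, dist=math.dist)
# Semantic graph: neighbours are spots with similar features, wherever they lie.
semantic_graph = knn_graph(features, k=1, dist=cosine_distance)

# A gate logit of 0 gives equal weight to both branches.
fused = gated_fusion([1.0, 2.0], [3.0, 4.0], gate_logit=0.0)
```

In this toy slice the two graphs disagree: spot 0 is spatially closest to spot 1 but semantically closest to spot 2. That is the kind of complementary neighbourhood information a dual-graph design is meant to expose. In a trained model the gate would be learned per spot or per feature rather than fixed.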