School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China.
Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac297.
The rapid development of spatial transcriptomics allows RNA abundance to be measured at high spatial resolution, making it possible to simultaneously profile gene expression, the spatial locations of cells or spots, and the corresponding hematoxylin and eosin-stained histology images. Predicting gene expression from histology images, which are relatively easy and cheap to obtain, has therefore become a promising direction. Several methods have been devised for this purpose, but they do not fully capture the internal relations among the 2D vision features or the spatial dependency between spots. Here, we developed Hist2ST, a deep learning-based model to predict RNA-seq expression from histology images. Around each sequenced spot, the corresponding histology image is cropped into an image patch and fed into a convolutional module to extract 2D vision features. Meanwhile, the spatial relations with the whole image and with neighboring patches are captured through Transformer and graph neural network modules, respectively. These learned features are then used to predict gene expression, which is modeled with a zero-inflated negative binomial (ZINB) distribution. To alleviate the impact of the small size of spatial transcriptomics datasets, a self-distillation mechanism is employed for efficient learning of the model. In comprehensive tests on cancer and normal datasets, Hist2ST outperformed existing methods in both gene expression prediction and spatial region identification. Further pathway analyses indicated that our model preserves biological information. Thus, Hist2ST enables the generation of spatial transcriptomics data from histology images for elucidating the molecular signatures of tissues.
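The ZINB modeling step mentioned in the abstract can be illustrated with a minimal sketch of the zero-inflated negative binomial log-likelihood. The abstract does not give the parameterization used by Hist2ST, so this assumes the common mean/dispersion form with mean `mu`, dispersion `theta`, and zero-inflation probability `pi`; the function names `nb_log_pmf`, `zinb_log_pmf`, and `zinb_nll` are hypothetical and are not taken from the Hist2ST code.

```python
import math


def nb_log_pmf(k, mu, theta):
    """Log-probability of count k under a negative binomial with
    mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(k, mu, theta, pi):
    """Log-probability of count k under a ZINB: with probability pi the
    count is a structural zero, otherwise it is drawn from the NB."""
    nb = nb_log_pmf(k, mu, theta)
    if k == 0:
        # A zero can come from the dropout component or from the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb


def zinb_nll(counts, mu, theta, pi):
    """Negative log-likelihood summed over observations; mu, theta, and pi
    are per-observation sequences (e.g. hypothetical model outputs)."""
    return -sum(
        zinb_log_pmf(k, m, t, p)
        for k, m, t, p in zip(counts, mu, theta, pi)
    )
```

In a setting like the one described, a network would output `mu`, `theta`, and `pi` for every gene at every spot, and a loss of this form would be minimized against the observed counts. How Hist2ST combines it with other loss terms is not stated in the abstract.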