Wang Haiyan, Zhang Yanping, Zhang Yangyang, Zhao Xuening, Bai Zijia, Ma Xuejing, Zhao Chunguang
School of Mathematics and Physics, Hebei University of Engineering, Handan, China.
School of Mathematics and Physics, Handan University, Handan, China.
PLoS One. 2025 Aug 14;20(8):e0329122. doi: 10.1371/journal.pone.0329122. eCollection 2025.
Spatial transcriptomics has revolutionized the analysis of gene expression by preserving tissue spatial information, providing novel insights into the cellular composition and function of complex biological tissues. However, current technologies are constrained by limited resolution and data sparsity, which compromise the accuracy of downstream analyses. To address these challenges, we developed SpaVGN, a deep learning framework that integrates convolutional neural networks, a vision transformer, and graph neural networks for high-fidelity gene expression imputation and spatial domain identification. By combining local feature extraction, global attention mechanisms, and spatial graph-based modeling, SpaVGN reconstructs missing transcriptomic data while preserving spatial tissue architecture. Evaluated on melanoma and sagittal posterior mouse brain datasets, SpaVGN outperformed existing methods in gene expression prediction, achieving Pearson correlation coefficients of 0.609 (melanoma) and 0.682 (mouse brain). It clearly delineated tumor regions and lymphoid niches in melanoma tissue and achieved fine-grained resolution of hippocampal subfields in the mouse brain, including Cornu Ammonis and Dentate Gyrus, with a Silhouette Score of 0.43 and a Davies-Bouldin Index of 0.86. Validation through UMAP dimensionality reduction and PAGA network analysis demonstrated that SpaVGN significantly mitigates the negative impact of data sparsity in spatial transcriptomics, improving data completeness and spatial continuity. This study presents a solution that enhances the resolution of spatial transcriptomics data, offers cross-tissue applicability, and provides a valuable tool for research in biological development, disease, and tumor heterogeneity.
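The abstract reports three evaluation metrics: the Pearson correlation coefficient for expression prediction, and the Silhouette Score and Davies-Bouldin Index for spatial domain clustering. The following is a minimal pure-Python sketch of the standard textbook definitions of these metrics, not the authors' evaluation code. The function names, the choice of Euclidean distance, and the per-gene averaging in `mean_gene_pearson` are illustrative assumptions; the abstract does not say how the reported values were aggregated.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when either vector is constant
    return cov / math.sqrt(vx * vy)


def mean_gene_pearson(observed, predicted):
    """Average per-gene Pearson correlation (assumed aggregation).

    Rows are genes, columns are spots; genes with undefined
    correlation are skipped.
    """
    scores = [pearson(o, p) for o, p in zip(observed, predicted)]
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid)


def _dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def _group(points, labels):
    """Map each cluster label to the list of its points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette over all points; range [-1, 1], higher is better."""
    clusters = _group(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # Mean distance to the other members of the same cluster.
        a = sum(_dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            sum(_dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin_index(points, labels):
    """Davies-Bouldin index; >= 0, lower is better.

    Assumes no two cluster centroids coincide.
    """
    clusters = _group(points, labels)
    centroids = {
        k: [sum(c) / len(pts) for c in zip(*pts)] for k, pts in clusters.items()
    }
    scatter = {
        k: sum(_dist(p, centroids[k]) for p in pts) / len(pts)
        for k, pts in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / _dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst) / len(worst)
```

Under these definitions, a higher Pearson correlation and Silhouette Score and a lower Davies-Bouldin Index indicate better imputation and better-separated domains, which is the direction in which the abstract's reported values should be read.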