College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China.
State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae259.
Advances in spatial transcriptomics (ST) technology enable a deeper understanding of the spatial properties of gene expression within tissues. However, because ST data are high-dimensional, noisy and subject to dynamic limitations, integrating gene expression with spatial information to accurately identify spatial domains remains challenging. This paper proposes SpaNCMG, an algorithm for precise spatial domain description and localization based on a neighborhood-complementary mixed-view graph convolutional network. The algorithm adapts better to ST data at different resolutions by integrating the local information from a k-nearest-neighbor (KNN) graph and the global structure from an r-radius graph into a complementary neighborhood graph. It also introduces an attention mechanism to adaptively fuse the different reconstructed expressions, and uses kernel principal component analysis (KPCA) for dimensionality reduction. Applied to five datasets from four sequencing platforms, SpaNCMG outperforms eight existing advanced methods. Specifically, it achieved the highest adjusted Rand index (ARI) values, 0.63 and 0.52, on the human dorsolateral prefrontal cortex and mouse somatosensory cortex datasets, respectively. It accurately identified the spatial locations of marker genes in mouse olfactory bulb tissue and inferred the biological functions of different regions. On larger datasets such as mouse embryos, SpaNCMG not only identified the main tissue structures but also explored unlabeled domains. Overall, the good generalization ability and scalability of SpaNCMG make it a useful tool for understanding tissue structure and disease mechanisms. Our code is available at https://github.com/ZhihaoSi/SpaNCMG.
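The idea of combining a KNN graph with an r-radius graph, followed by KPCA, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_complementary_graph` and `reduce_with_kpca`, the choice of an edge-wise union as the way to merge the two graphs, the RBF kernel, and all parameter values are assumptions made for illustration, using scikit-learn. The attention-based fusion and the graph convolutional network are not shown.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph


def build_complementary_graph(coords, k=4, radius=1.5):
    """Merge a KNN graph and an r-radius graph over spot coordinates.

    Assumption: the two neighborhoods are merged by an edge-wise union.
    The paper may combine them differently.
    """
    # Local view: each spot is linked to its k nearest spots.
    knn = kneighbors_graph(coords, n_neighbors=k, mode="connectivity",
                           include_self=False)
    # Radius view: each spot is linked to every spot within `radius`.
    rad = radius_neighbors_graph(coords, radius=radius, mode="connectivity",
                                 include_self=False)
    # KNN graphs are not symmetric in general, so symmetrize before merging.
    knn_sym = knn.maximum(knn.T)
    # An edge is present if it appears in either view.
    return knn_sym.maximum(rad).tocsr()


def reduce_with_kpca(expression, n_components=20, kernel="rbf"):
    """Reduce an expression matrix (spots x genes) with kernel PCA.

    Assumption: the RBF kernel and component count are illustrative.
    """
    kpca = KernelPCA(n_components=n_components, kernel=kernel)
    return kpca.fit_transform(expression)


# Toy data: a 10 x 10 grid of spots with random "expression" values.
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
rng = np.random.default_rng(0)
expression = rng.normal(size=(coords.shape[0], 50))

adjacency = build_complementary_graph(coords, k=4, radius=1.5)
embedding = reduce_with_kpca(expression, n_components=20)
print(adjacency.shape, embedding.shape)
```

In this sketch the union keeps every edge found by either view, so sparse regions still receive k neighbors while dense regions keep all spots within the radius. In practice, `k` and `radius` would need tuning to the spot spacing of each platform.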