Lu Qiaolin, Ding Jiayuan, Li Lingxiao, Chang Yi
School of Artificial Intelligence, Jilin University, Qianjin Street 2699, 130010 Changchun, China.
Department of Computer Science and Engineering, Michigan State University, 220 Trowbridge Rd, East Lansing, MI 48824, United States.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf020.
Imaging-based spatial transcriptomics (iST) technologies, such as MERFISH, CosMx SMI, and Xenium, quantify gene expression levels across cells in space; more importantly, they directly reveal the subcellular distribution of RNA transcripts at single-molecule resolution. The subcellular localization of RNA molecules plays a crucial role in the compartmentalization-dependent regulation of genes within individual cells. Understanding the intracellular spatial distribution of RNA in a particular cell type therefore not only improves the characterization of cell identity but is also essential for elucidating subcellular regulatory mechanisms specific to that cell type. However, current cell type annotation approaches for iST primarily use gene expression information and neglect the spatial distribution of RNAs within cells. In this work, we introduce Focus, a semi-supervised graph contrastive learning method that is, to the best of our knowledge, the first to explicitly model the subcellular distribution and community of RNA to improve cell type annotation. Focus demonstrates significant improvements over state-of-the-art algorithms across a range of spatial transcriptomics platforms, with gains of up to 27.8% in accuracy and 51.9% in F1-score for cell type annotation. Furthermore, Focus captures intricate cell type-specific subcellular spatial gene patterns and provides interpretable subcellular gene analysis, such as a gene importance score. Importantly, using this importance score, Focus identifies genes with strong relevance to cell type-specific pathways, indicating its potential for uncovering novel regulatory programs across numerous biological systems.