Deng Tao, Huang Mengqian, Xu Kaichen, Lu Yan, Xu Yucheng, Chen Siyu, Xie Nina, Tao Qiuyue, Wu Hao, Sun Xiaobo
School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Shenzhen 518172, China.
Shenzhen Research Institute of Big Data, Shenzhen 518172, China.
Genomics Proteomics Bioinformatics. 2025 Jul 1. doi: 10.1093/gpbjnl/qzaf056.
Identifying co-expressed genes across tissue domains and cell types is essential for revealing co-functional genes involved in biological or pathological processes. Although both single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomic (SRT) data offer insights into gene co-expression patterns, current methods typically use only one of the two data types, potentially diluting the co-functionality signals within co-expressed gene groups. To bridge this gap, we introduce muLtimodal co-Expressed GENes finDer (LEGEND), a computational method that integrates scRNA-seq and SRT data to identify groups of co-expressed genes at both the cell type and tissue domain levels. LEGEND employs a hierarchical clustering algorithm designed to maximize intra-cluster redundancy and inter-cluster complementarity, capturing more nuanced patterns of gene co-expression and spatial coherence. Enrichment and co-function analyses further demonstrate the biological relevance of these gene clusters and their utility in exploring novel, context-specific gene functions. Notably, LEGEND can reveal shifts in gene-gene interactions under different conditions, providing insights into disease-associated gene crosstalk. Moreover, LEGEND can be used to improve the annotation accuracy of both spatial spots in SRT data and single cells in scRNA-seq data, and it pioneers the identification of genes with designated spatial expression patterns. LEGEND is available at https://github.com/ToryDeng/LEGEND.