Enhanced single-cell RNA-seq embedding through gene expression and data-driven gene-gene interaction integration.

Author information

Goudarzi Hojjat Torabi, Pouyan Maziyar Baran

Affiliations

Electrical Engineering and Computer Science Department, Oregon State University, Address one, Corvallis, 97331, OR, United States.

Accenture Technology Labs, Address two, San Francisco, 94105, CA, United States.

Publication information

Comput Biol Med. 2025 Apr;188:109880. doi: 10.1016/j.compbiomed.2025.109880. Epub 2025 Feb 24.

Abstract

Single-cell RNA sequencing (scRNA-seq) provides unprecedented insights into cellular heterogeneity, enabling detailed analysis of complex biological systems at single-cell resolution. However, the high dimensionality and technical noise inherent in scRNA-seq data pose significant analytical challenges. While current embedding methods focus primarily on gene expression levels, they often overlook crucial gene-gene interactions that govern cellular identity and function. To address this limitation, we present a novel embedding approach that integrates both gene expression profiles and data-driven gene-gene interactions. Our method first constructs a Cell-Leaf Graph (CLG) using random forest models to capture regulatory relationships between genes, while simultaneously building a K-Nearest Neighbor Graph (KNNG) to represent expression similarities between cells. These graphs are then combined into an Enriched Cell-Leaf Graph (ECLG), which serves as input for a graph neural network to compute cell embeddings. By incorporating both expression levels and gene-gene interactions, our approach provides a more comprehensive representation of cellular states. Extensive evaluation across multiple datasets demonstrates that our method enhances the detection of rare cell populations and improves downstream analyses such as visualization, clustering, and trajectory inference. This integrated approach represents a significant advance in single-cell data analysis, offering a more complete framework for understanding cellular diversity and dynamics.
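The graph construction described in the abstract can be illustrated with a rough sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: each target gene gets a random forest trained to predict it from the remaining genes, every leaf of every tree becomes a node linked to the cells that land in it (a stand-in for the CLG), and cell-to-cell K-nearest-neighbor edges are added to the same adjacency matrix (a stand-in for the ECLG). The function name `build_eclg` and all of its parameters are hypothetical, and the graph neural network step is omitted.

```python
import numpy as np
from scipy import sparse
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import kneighbors_graph


def build_eclg(X, target_genes, n_trees=5, max_depth=3, k=5, seed=0):
    """Hypothetical sketch: adjacency over cell nodes followed by leaf nodes.

    X is a (cells x genes) expression matrix. This is NOT the authors'
    implementation, only one plausible reading of the abstract.
    """
    n_cells, n_genes = X.shape
    rows, cols = [], []
    n_leaf_nodes = 0

    for g in target_genes:
        # Assumption: predict gene g from all other genes, so that the tree
        # splits reflect data-driven gene-gene dependencies.
        predictors = np.delete(np.arange(n_genes), g)
        forest = RandomForestRegressor(
            n_estimators=n_trees, max_depth=max_depth, random_state=seed
        )
        forest.fit(X[:, predictors], X[:, g])

        # apply() returns, for every cell, the leaf it reaches in each tree.
        leaves = forest.apply(X[:, predictors])  # shape (n_cells, n_trees)
        for t in range(n_trees):
            # Relabel this tree's leaves as 0..m-1, then offset them so each
            # leaf of each tree gets its own node index.
            _, leaf_ids = np.unique(leaves[:, t], return_inverse=True)
            rows.extend(range(n_cells))
            cols.extend(leaf_ids + n_leaf_nodes)
            n_leaf_nodes += int(leaf_ids.max()) + 1

    # Bipartite cell-leaf block (stand-in for the Cell-Leaf Graph).
    cell_leaf = sparse.csr_matrix(
        (np.ones(len(rows)), (rows, cols)), shape=(n_cells, n_leaf_nodes)
    )

    # Cell-cell KNN block on raw expression, symmetrized.
    knn = kneighbors_graph(X, n_neighbors=k, mode="connectivity",
                           include_self=False)
    knn = knn.maximum(knn.T)

    # Assumption: "enriching" means taking the union of both edge sets.
    eclg = sparse.bmat([[knn, cell_leaf], [cell_leaf.T, None]], format="csr")
    return eclg, n_leaf_nodes


# Toy usage on synthetic counts (40 cells, 8 genes); the values are random.
rng = np.random.default_rng(0)
X = rng.poisson(lam=3.0, size=(40, 8)).astype(float)
targets = [0, 1, 2]
eclg, n_leaves = build_eclg(X, target_genes=targets)
```

In this sketch the resulting sparse matrix could be handed to any graph neural network library as an adjacency structure; which architecture and node features the authors use is not stated in the abstract.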
