Multi-view clustering for single-cell RNA-seq data based on graph fusion.

Authors

Wang Jing, Xia Junfeng, Tan Dayu, Ma Yunjie, Su Yansen, Zheng Chun-Hou

Affiliations

Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China.

Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China.

Publication information

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf193.

Abstract

Single-cell RNA sequencing (scRNA-seq) profiles the transcriptome of individual cells, enabling in-depth studies of cell heterogeneity at single-cell resolution. Cell clustering is the foundation of scRNA-seq data analysis, but the high dimensionality and frequent dropout events of the data pose major challenges. Although many dedicated clustering methods have been proposed, they often fail to fully explore the underlying data structure. Here, we introduce scMCGF, a new multi-view clustering algorithm based on graph fusion. It uses multi-view data generated from transcriptomic data to learn the consistent and complementary information across different views, ultimately constructing a unified graph matrix for robust cell clustering. Specifically, scMCGF uses two dimensionality-reduction methods (principal component analysis and diffusion maps) to capture both linear and non-linear characteristics of the data. Additionally, it calculates a cell-pathway score matrix to incorporate pathway-level information. These three feature sets, along with the pre-processed gene expression data, form the multi-view data. scMCGF iteratively refines the structure of each view's similarity graph through adaptive learning and learns a unified graph matrix by weighting and fusing the individual similarity graph matrices. The final clustering results are obtained by applying a rank constraint to the Laplacian matrix of the unified graph matrix. Experimental results on 13 real datasets show that scMCGF outperforms eight state-of-the-art methods in clustering accuracy and robustness. Furthermore, biological analysis validates that the clustering results of scMCGF provide a reliable foundation for downstream investigations.
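
The pipeline described in the abstract (four views, one similarity graph per view, weighted fusion, clustering on the fused graph) can be illustrated with a rough NumPy sketch. This is not the scMCGF implementation, and every name in it (`pca_view`, `diffusion_view`, `pathway_view`, `knn_graph`, `fuse_graphs`, `spectral_labels`, the `pathway_*` gene sets) is hypothetical. Several steps are simplified stand-ins: pathway scores are plain mean expression per gene set, the kernel widths use a median heuristic, the view weights follow a simple inverse-distance rule instead of the paper's adaptive graph learning, and ordinary spectral clustering with k-means replaces the rank constraint on the Laplacian. The data are synthetic toy counts.

```python
import numpy as np

def sq_dists(F):
    """Pairwise squared Euclidean distances between the rows of F."""
    sq = (F ** 2).sum(axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * F @ F.T, 0.0)

def pca_view(X, n_components=5):
    """Linear view: scores on the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def diffusion_view(X, n_components=5):
    """Non-linear view: basic diffusion-map coordinates from a Gaussian kernel."""
    D = sq_dists(X)
    K = np.exp(-D / np.median(D[D > 0]))       # kernel width: median heuristic (assumption)
    d = K.sum(axis=1)
    vals, vecs = np.linalg.eigh(K / np.sqrt(np.outer(d, d)))
    order = np.argsort(vals)[::-1][1:n_components + 1]  # skip the trivial top eigenvector
    return (vecs[:, order] / np.sqrt(d)[:, None]) * vals[order]

def pathway_view(X, gene_sets):
    """Pathway view: mean expression per gene set (stand-in scoring rule)."""
    return np.column_stack([X[:, idx].mean(axis=1) for idx in gene_sets.values()])

def knn_graph(F, k=10):
    """Symmetric k-nearest-neighbour similarity graph with Gaussian weights."""
    D = sq_dists(F)
    np.fill_diagonal(D, np.inf)                # exclude self-loops
    rows = np.arange(len(D))[:, None]
    nbrs = np.argsort(D, axis=1)[:, :k]
    sigma = np.median(D[rows, nbrs]) + 1e-12
    S = np.zeros_like(D)
    S[rows, nbrs] = np.exp(-D[rows, nbrs] / sigma)
    return (S + S.T) / 2.0

def fuse_graphs(graphs, n_iter=20):
    """Weighted fusion: views closer to the consensus graph get larger weights."""
    w = np.full(len(graphs), 1.0 / len(graphs))
    for _ in range(n_iter):
        U = sum(wv * S for wv, S in zip(w, graphs))
        dist = np.array([np.linalg.norm(U - S) for S in graphs])
        w = 1.0 / (2.0 * dist + 1e-12)         # inverse-distance rule (assumption)
        w /= w.sum()
    U = sum(wv * S for wv, S in zip(w, graphs))
    return U, w

def spectral_labels(U, n_clusters):
    """Spectral clustering on the fused graph (stand-in for the rank constraint)."""
    d = U.sum(axis=1)
    L = np.eye(len(U)) - U / np.sqrt(np.outer(d, d))   # normalized Laplacian
    _, vecs = np.linalg.eigh(L)
    E = vecs[:, :n_clusters]
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)
    centers = [E[0]]                           # deterministic farthest-point initialization
    for _ in range(1, n_clusters):
        dmin = np.min([((E - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dmin)])
    C = np.array(centers)
    for _ in range(50):                        # plain k-means iterations
        labels = np.argmin(((E[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([E[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                      for j in range(n_clusters)])
    return labels

# Toy data: 60 cells x 40 genes, two groups with different highly expressed genes.
rng = np.random.default_rng(0)
rates = np.full((60, 40), 0.5)
rates[:30, :20] = 5.0
rates[30:, 20:] = 5.0
X = np.log1p(rng.poisson(rates))               # log-transformed toy counts

# Hypothetical gene sets: four blocks of ten genes each.
gene_sets = {f"pathway_{i}": np.arange(10 * i, 10 * (i + 1)) for i in range(4)}

views = [X, pca_view(X), diffusion_view(X), pathway_view(X, gene_sets)]
graphs = [knn_graph(V) for V in views]
U, weights = fuse_graphs(graphs)
labels = spectral_labels(U, n_clusters=2)
print("view weights:", np.round(weights, 3))
print("cluster sizes:", np.bincount(labels))
```

The inverse-distance weighting is one common way to let the fusion step favor views that agree with the consensus graph; the weighting and graph-refinement scheme actually used by scMCGF may differ and should be taken from the paper.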

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fecb/12103903/6a712464b91d/bbaf193f1.jpg
