Complex hierarchical structures analysis in single-cell data with Poincaré deep manifold transformation.

Authors

Xu Yongjie, Zang Zelin, Hu Bozhen, Yuan Yue, Tan Cheng, Xia Jun, Li Stan Z

Affiliations

School of Information Science & Electronic Engineering, Zhejiang University, No. 866 Yuhangtang Road, 310058 Zhejiang, P.R. China.

School of Engineering, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P.R. China.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae687.

Abstract

Single-cell RNA sequencing (scRNA-seq) offers remarkable insights into cellular development and differentiation by capturing the gene expression profiles of individual cells. The role of dimensionality reduction and visualization in the interpretation of scRNA-seq data has gained wide acceptance. However, current methods face several challenges, including incomplete structure-preserving strategies and high distortion in embeddings, and as a result they fail to effectively model complex cell trajectories with multiple branches. To address these issues, we propose the Poincaré deep manifold transformation (PoincaréDMT) method, which maps high-dimensional scRNA-seq data to a hyperbolic Poincaré disk. This approach preserves global structure from a graph Laplacian matrix while achieving local structure correction through a structure module combined with data augmentation. Additionally, PoincaréDMT alleviates batch effects by integrating a batch graph that accounts for batch labels into the low-dimensional embeddings during network training. Furthermore, PoincaréDMT applies the Shapley additive explanations method to the trained model to identify the important marker genes in specific clusters and cell differentiation processes. PoincaréDMT thus provides a unified framework for multiple key tasks essential for scRNA-seq analysis, including trajectory inference, pseudotime inference, batch correction, and marker gene selection. We validate PoincaréDMT through extensive evaluations on both simulated and real scRNA-seq datasets, demonstrating its superior performance in preserving global and local data structures compared to existing methods.
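
As a rough illustration of why a hyperbolic Poincaré disk suits branching hierarchies, the sketch below computes the standard Poincaré-disk distance between 2-D points. It is not taken from the PoincaréDMT code: the function name `poincare_distance`, the `eps` guard, and the sample points are assumptions made for this example only. Distances grow rapidly as points approach the unit circle, which leaves room near the boundary for many well-separated branches.

```python
import math

def poincare_distance(u, v, eps=1e-12):
    """Hyperbolic distance between two points strictly inside the unit disk.

    Uses the standard formula
        d(u, v) = arccosh(1 + 2*|u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    `eps` is a hypothetical numerical guard for points very close to the boundary.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_norm_u = sum(a * a for a in u)
    sq_norm_v = sum(b * b for b in v)
    denom = max((1.0 - sq_norm_u) * (1.0 - sq_norm_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)

# Illustrative points (assumed, not from the paper's datasets).
origin = (0.0, 0.0)
inner = (0.5, 0.0)
near_edge_a = (0.95, 0.0)
near_edge_b = (0.0, 0.95)

# From the origin to radius r the distance is ln((1 + r) / (1 - r)); for r = 0.5 that is ln(3).
d_inner = poincare_distance(origin, inner)

# Two points near the boundary are far apart in hyperbolic terms,
# even though their Euclidean separation is modest.
d_edge = poincare_distance(near_edge_a, near_edge_b)
euclid_edge = math.dist(near_edge_a, near_edge_b)

print(f"origin -> inner: {d_inner:.4f}")
print(f"edge a -> edge b: hyperbolic {d_edge:.4f}, Euclidean {euclid_edge:.4f}")
```

In an embedding of this kind, root-like cell states would typically sit near the center of the disk and terminal branches near the rim; how PoincaréDMT actually places cells is determined by its training objective, which this sketch does not reproduce.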

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ccba/11757945/b441a71c8ac4/bbae687f1.jpg