Chen Yuhao, Zhang Yan, Gan Jiaqi, Ni Ke, Chen Ming, Bahar Ivet, Xing Jianhua
Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China.
Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
bioRxiv. 2025 Jan 11:2024.12.03.626638. doi: 10.1101/2024.12.03.626638.
RNA velocity and its generalizations have emerged as powerful approaches for extracting time-resolved information from high-throughput snapshot single-cell data. Yet several inherent limitations prevent these approaches from being applied to genes that are unsuitable for RNA velocity inference because of complex transcriptional dynamics, low expression, or a lack of splicing dynamics, or to data from non-transcriptomic modalities. Here, we present GraphVelo, a graph-based machine learning procedure that takes as input the RNA velocities inferred by existing methods and infers velocity vectors that lie in the tangent space of the low-dimensional manifold formed by the single-cell data. GraphVelo preserves vector magnitude and direction information during transformations across different data representations. Tests on multiple synthetic and experimental scRNA-seq datasets, including viral-host interactome and multi-omics datasets, demonstrate that GraphVelo, together with downstream generalized dynamo analyses, extends RNA velocities to multi-modal data and reveals quantitative nonlinear regulatory relationships between genes, between virus and host cells, and between different layers of gene regulation.
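The tangent-space idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the GraphVelo implementation or its API: the function names (`knn_indices`, `tangent_coefficients`, `apply_coefficients`), the plain ridge-regularized least-squares fit, and the dense coefficient matrix are all assumptions made for illustration, and the published method may use a different objective. The sketch expresses each cell's velocity as a weighted sum of displacement vectors to its nearest neighbors, which confines the result to the span of local neighbor directions (an approximation of the tangent space), and then reuses the same weights to carry the velocity into another data representation.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (brute force)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude each cell from its own neighbors
    return np.argsort(d2, axis=1)[:, :k]

def tangent_coefficients(X, V, k=5, ridge=1e-8):
    """Hypothetical helper: fit weights phi[i, j] so that
    V[i] ~ sum_j phi[i, j] * (X[j] - X[i]) over the k nearest neighbors j.
    Components of V[i] outside the span of neighbor displacements are dropped."""
    n = X.shape[0]
    nbrs = knn_indices(X, k)
    phi = np.zeros((n, n))
    for i in range(n):
        D = X[nbrs[i]] - X[i]             # (k, d) displacement vectors
        G = D @ D.T + ridge * np.eye(k)   # ridge-regularized Gram matrix
        phi[i, nbrs[i]] = np.linalg.solve(G, D @ V[i])
    return phi

def apply_coefficients(phi, Y):
    """Velocity in representation Y: v_i = sum_j phi[i, j] * (Y[j] - Y[i])."""
    return phi @ Y - phi.sum(axis=1, keepdims=True) * Y
```

Calling `apply_coefficients(phi, X)` gives the projected velocities in the original space, while passing a different representation `Y` of the same cells (for example a low-dimensional embedding or another modality) gives the corresponding vectors there. When `Y` is a linear map of `X`, the two results are related by exactly that map, which is one simple sense in which magnitude and direction information carries over between representations.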