Senra Daniela, Guisoni Nara, Diambra Luis
Centro Regional de Estudios Genómicos, Universidad Nacional de La Plata, Argentina.
MethodsX. 2022 Jul 1;9:101778. doi: 10.1016/j.mex.2022.101778. eCollection 2022.
Trajectory inference is a common application of scRNA-seq data. However, it is often necessary to determine beforehand the origin of the trajectories, that is, the stem or progenitor cells. In this work, we propose a computational tool to quantify pluripotency from single-cell transcriptomics data. This approach uses the protein-protein interaction (PPI) network associated with the differentiation process as a scaffold and the gene expression matrix to calculate a score that we call differentiation activity. This score reflects how active the differentiation network is in each cell. We benchmark the performance of our algorithm against two previously published tools, LandSCENT (Chen et al., 2019) and CytoTRACE (Gulati et al., 2020), on four healthy human data sets: breast, colon, hematopoietic and lung. We show that our algorithm is more efficient than LandSCENT and requires less RAM than the other programs. We also illustrate a complete workflow from the count matrix to trajectory inference using the breast data set.
•ORIGINS is a methodology to quantify pluripotency from scRNA-seq data, implemented as a freely available R package.
•ORIGINS uses the protein-protein interaction network associated with differentiation and the data set expression matrix to calculate a score (differentiation activity) that quantifies pluripotency for each cell.
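To make the idea of a network-based score concrete, the following is a minimal sketch of how a per-cell activity could be computed from a PPI adjacency matrix and an expression matrix. The abstract does not give the formula, so the scoring rule used here (the sum, over network edges, of the product of the expression of the two interacting genes) is an assumption for illustration only. The function names `differentiation_activity` and `rescale` are hypothetical, and the code is written in Python with NumPy rather than with the ORIGINS R package.

```python
import numpy as np


def differentiation_activity(expr, adjacency):
    """Per-cell activity of a gene network (illustrative assumption).

    expr      : genes x cells expression matrix.
    adjacency : genes x genes symmetric 0/1 matrix of the PPI network,
                with rows/columns in the same gene order as `expr`.
    Returns one score per cell: the sum over undirected edges (i, j)
    of expr[i, cell] * expr[j, cell].
    """
    expr = np.asarray(expr, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    n_genes = expr.shape[0]
    if adjacency.shape != (n_genes, n_genes):
        raise ValueError("adjacency must be genes x genes, matching expr")
    # (adjacency @ expr)[i, c] is the summed expression of gene i's
    # neighbours in cell c; multiplying by expr[i, c] and summing over i
    # counts every undirected edge twice, hence the factor 0.5.
    return 0.5 * np.sum(expr * (adjacency @ expr), axis=0)


def rescale(scores):
    """Min-max rescale scores to [0, 1] so cells can be compared."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


# Toy network: 3 genes with edges gene0-gene1 and gene1-gene2.
toy_adjacency = np.array([[0, 1, 0],
                          [1, 0, 1],
                          [0, 1, 0]])
# Toy expression: rows are genes, columns are 3 cells.
toy_expr = np.array([[1, 2, 1],
                     [1, 0, 2],
                     [1, 3, 3]])

toy_scores = differentiation_activity(toy_expr, toy_adjacency)
toy_scaled = rescale(toy_scores)
print(toy_scores)  # [2. 0. 8.]
print(toy_scaled)  # [0.25 0.   1.  ]
```

In this toy example the second cell scores zero because the hub gene (gene1) is not expressed, so neither edge contributes. This illustrates why a network scaffold can separate cells that a plain expression sum would not.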