Physics Department, Bar-Ilan University, Ramat-Gan, Israel.
Department of Immunobiology, Howard Hughes Medical Institute, Yale University School of Medicine, New Haven, CT, USA.
Sci Rep. 2022 May 9;12(1):7547. doi: 10.1038/s41598-022-11507-y.
Genes are linked by underlying regulatory mechanisms and by jointly implementing biological functions, working in coordination to carry out different tasks in the cell. Assessing the level of coordination between genes from single-cell transcriptomic data, without a priori knowledge of the map of gene regulatory interactions, is a challenge. A 'top-down' approach has recently been developed to analyze single-cell transcriptomic data by evaluating the global coordination level (GCL) between genes. Here, we systematically analyze the performance of the GCL in typical scenarios of single-cell RNA sequencing (scRNA-seq) data. We show that an individual anomalous cell can have a disproportionate effect on the GCL calculated over a cohort of cells. In addition, we demonstrate how the GCL is affected by the presence of clusters, which are very common in scRNA-seq data. Finally, we analyze the effect of the sample size used in the jackknife procedure on the GCL statistics. The manuscript is accompanied by a description of a custom-built Python package for calculating the GCL. These results provide practical guidelines for properly pre-processing transcriptomic data and applying the GCL measure to it.
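The abstract does not define the GCL, so the following is only a minimal sketch under stated assumptions. It assumes the definition used when the GCL was introduced: the bias-corrected distance correlation between two random halves of the genes, averaged over many random splits. It also assumes a cells-by-genes matrix and a jackknife-style step that recomputes the GCL on random subsets of cells. All function names, parameters, and defaults (for example the 0.8 subsampling fraction) are hypothetical and are not the API of the Python package mentioned above.

```python
import numpy as np


def _u_centered_distances(x):
    """U-centered Euclidean distance matrix between the rows (cells) of x."""
    n = x.shape[0]
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d = np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off
    np.fill_diagonal(d, 0.0)
    row = d.sum(axis=1, keepdims=True)
    col = d.sum(axis=0, keepdims=True)
    u = d - row / (n - 2) - col / (n - 2) + d.sum() / ((n - 1) * (n - 2))
    np.fill_diagonal(u, 0.0)  # U-centering sets the diagonal to zero
    return u


def bcdcorr(x, y):
    """Bias-corrected distance correlation between two cells-by-features arrays."""
    n = x.shape[0]
    if n < 4 or y.shape[0] != n:
        raise ValueError("need the same number of cells (at least 4) in x and y")
    a = _u_centered_distances(x)
    b = _u_centered_distances(y)
    scale = n * (n - 3)
    cov = (a * b).sum() / scale
    denom = np.sqrt(((a * a).sum() / scale) * ((b * b).sum() / scale))
    return cov / denom if denom > 0 else 0.0


def gcl(data, n_splits=50, rng=None):
    """Assumed GCL: mean bcdcorr over random half/half splits of the genes.

    data: array of shape (n_cells, n_genes).
    """
    rng = np.random.default_rng(rng)
    n_genes = data.shape[1]
    half = n_genes // 2
    values = []
    for _ in range(n_splits):
        perm = rng.permutation(n_genes)
        values.append(bcdcorr(data[:, perm[:half]], data[:, perm[half:]]))
    return float(np.mean(values))


def jackknife_gcl(data, fraction=0.8, n_resamples=20, n_splits=20, rng=None):
    """Hypothetical jackknife-style step: GCL on random subsets of the cells."""
    rng = np.random.default_rng(rng)
    n_cells = data.shape[0]
    size = max(4, int(round(fraction * n_cells)))
    out = []
    for _ in range(n_resamples):
        idx = rng.choice(n_cells, size=size, replace=False)
        out.append(gcl(data[idx], n_splits=n_splits, rng=rng))
    return np.array(out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 60 cells, 40 genes driven by one shared latent factor.
    latent = rng.normal(size=(60, 1))
    weights = rng.normal(size=(1, 40))
    cells = latent @ weights + 0.1 * rng.normal(size=(60, 40))
    print("GCL:", gcl(cells, rng=1))
    print("Subsampled GCL values:", jackknife_gcl(cells, n_resamples=5, rng=2))
```

In this sketch, the spread of the values returned by `jackknife_gcl` depends on `fraction`, which is the kind of sample-size effect the abstract refers to. Appending a single extreme row to `cells` before calling `gcl` is one simple way to probe the sensitivity to an anomalous cell.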