Department of Biomedical Engineering, Georgia Institute of Technology, Emory University, Atlanta, Georgia, USA.
J Comput Biol. 2023 Jul;30(7):738-750. doi: 10.1089/cmb.2022.0366. Epub 2023 Apr 22.
With rapid advances in single-cell profiling technologies, larger-scale investigations that compare multiple single-cell datasets can lead to novel findings. In particular, quantifying cell-type-specific responses to different conditions across single-cell datasets could help explain how differences between conditions arise at the cellular level. In this study, we present a computational pipeline that quantifies cell-type-specific differences and identifies the genes responsible for them. We quantify differences observed in a low-dimensional uniform manifold approximation and projection (UMAP) space as a proxy for the differences present in the high-dimensional space, and use SHapley Additive exPlanations (SHAP) to quantify the genes driving those differences. We applied our algorithm to the Iris flower dataset, a single-cell RNA sequencing dataset, and a mass cytometry dataset, and demonstrate that it can robustly quantify cell-type-specific differences and identify the genes responsible for them.
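The attribution idea described above can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: it skips the UMAP embedding entirely, measures the difference between two conditions as the Euclidean distance between group centroids in the original feature space, and computes exact Shapley values by enumerating feature subsets instead of calling the SHAP library. The function names (`group_distance`, `shapley_attribution`) and the toy measurements are hypothetical and exist only for illustration.

```python
from itertools import combinations
from math import factorial, sqrt


def centroid(rows):
    """Per-feature mean of a list of equally long sample vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def group_distance(a, b, features):
    """Euclidean distance between the centroids of groups a and b,
    restricted to the feature indices in `features` (assumed score)."""
    ca, cb = centroid(a), centroid(b)
    return sqrt(sum((ca[j] - cb[j]) ** 2 for j in features))


def shapley_attribution(a, b):
    """Exact Shapley value of each feature for the group-distance score.

    Enumerates every subset of the other features, so it is only
    practical for a handful of features.
    """
    n = len(a[0])
    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = (group_distance(a, b, subset + (j,))
                        - group_distance(a, b, subset))
                phi[j] += weight * gain
    return phi


# Made-up toy measurements: 3 samples per condition, 3 features ("genes").
control = [[1.0, 2.0, 0.5], [1.2, 2.1, 0.4], [0.8, 1.9, 0.6]]
treated = [[1.6, 4.0, 0.5], [1.4, 4.2, 0.6], [1.5, 3.8, 0.4]]

phi = shapley_attribution(control, treated)
total = group_distance(control, treated, range(3))
print("per-feature attribution:", phi)
print("sum of attributions:", sum(phi), "total distance:", total)
```

In this toy setup the attributions sum to the total centroid distance, and the feature whose mean shifts the most receives the largest share. For real single-cell data, exhaustive enumeration is infeasible, which is why approximate methods such as those in the SHAP library are used.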