Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland.
Applied Mathematics, Applied Statistics, and Scientific Computing, University of Maryland, College Park, Maryland.
Cytometry A. 2016 Jan;89(1):71-88. doi: 10.1002/cyto.a.22735. Epub 2015 Aug 14.
Flow cytometry (FCM) is a fluorescence-based single-cell experimental technology that is routinely applied in biomedical research for identifying cellular biomarkers of normal physiological responses and abnormal disease states. While many computational methods have been developed that focus on identifying cell populations in individual FCM samples, very few have addressed how the identified cell populations can be matched across samples for comparative analysis. This article presents FlowMap-FR, a novel method for cell population mapping across FCM samples. FlowMap-FR is based on the Friedman-Rafsky nonparametric test statistic (FR statistic), which quantifies the equivalence of multivariate distributions. As applied to FCM data by FlowMap-FR, the FR statistic objectively quantifies the similarity between cell populations based on the shapes, sizes, and positions of fluorescence data distributions in the multidimensional feature space. To test and evaluate the performance of FlowMap-FR, we simulated the kinds of biological and technical sample variations that are commonly observed in FCM data. The results show that FlowMap-FR is able to effectively identify equivalent cell populations between samples under scenarios of proportion differences and modest position shifts. As a statistical test, FlowMap-FR can be used to determine whether the expression of a cellular marker is statistically different between two cell populations, suggesting candidates for new cellular phenotypes by providing an objective statistical measure. In addition, FlowMap-FR can indicate situations in which inappropriate splitting or merging of cell populations has occurred during gating procedures. Using both simulated and real data, we compared the FR statistic with the symmetric version of the Kullback-Leibler (KL) divergence, the measure used in a previous population matching method. The FR statistic outperforms the symmetric KL divergence in distinguishing equivalent from nonequivalent cell populations. FlowMap-FR was also employed as a distance metric to match cell populations delineated by manual gating across 30 FCM samples from a benchmark FlowCAP data set. An F-measure of 0.88 was obtained, indicating high precision and recall of the FR-based population matching results. FlowMap-FR has been implemented as a standalone R/Bioconductor package so that it can be easily incorporated into current FCM data analytical workflows.
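To make the idea behind the FR statistic concrete, the following is a minimal sketch of the classical Friedman-Rafsky two-sample test, not the FlowMap-FR implementation itself. It pools two point sets, builds a Euclidean minimum spanning tree (MST), counts the MST edges that join points from different samples, and standardizes that count with the permutation mean and variance given by Friedman and Rafsky. The function names `mst_edges` and `fr_statistic` are illustrative assumptions, and details such as downsampling or any preprocessing used by the R/Bioconductor package are not reproduced here.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the N-1 MST edges as (i, j) index pairs. O(N^2) time,
    which is adequate for a small illustration.
    """
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Relax distances from the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges


def fr_statistic(sample_x, sample_y):
    """Standardized Friedman-Rafsky statistic W for two point sets.

    Values near 0 are consistent with a common distribution; strongly
    negative values indicate that the two samples are separated.
    Requires at least 4 pooled points.
    """
    m, n = len(sample_x), len(sample_y)
    total = m + n
    pooled = list(sample_x) + list(sample_y)
    labels = [0] * m + [1] * n

    edges = mst_edges(pooled)
    # R = number of subtrees left after deleting cross-sample edges.
    cross = sum(1 for i, j in edges if labels[i] != labels[j])
    runs = cross + 1

    # C = number of MST edge pairs that share a common node.
    degree = [0] * total
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    shared = sum(d * (d - 1) // 2 for d in degree)

    # Permutation mean and variance of R, conditional on the tree.
    mean_runs = 2.0 * m * n / total + 1.0
    var_runs = (2.0 * m * n / (total * (total - 1))) * (
        (2.0 * m * n - total) / total
        + (shared - total + 2.0) / ((total - 2.0) * (total - 3.0))
        * (total * (total - 1.0) - 4.0 * m * n + 2.0)
    )
    return (runs - mean_runs) / math.sqrt(var_runs)


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data, purely for illustration

    def cloud(cx, cy, k):
        return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(k)]

    print("same location:   ", fr_statistic(cloud(0, 0, 40), cloud(0, 0, 40)))
    print("shifted location:", fr_statistic(cloud(0, 0, 40), cloud(8, 8, 40)))
```

As a hand-checkable case, the one-dimensional samples {0, 1} and {10, 11} give an MST with a single cross-sample edge, so R = 2, E[R] = 3, Var[R] = 2/3, and W = -sqrt(1.5) ≈ -1.22. In the synthetic demo, the shifted pair should yield a far more negative W than the co-located pair, which is the behavior a population-matching method would rely on when treating the statistic as a similarity measure.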