Chan Cliburn, Feng Feng, Ottinger Janet, Foster David, West Mike, Kepler Thomas B
Center for Computational Immunology, Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina 27705, USA.
Cytometry A. 2008 Aug;73(8):693-701. doi: 10.1002/cyto.a.20583.
Statistical mixture modeling provides an opportunity for automated identification and resolution of cell subtypes in flow cytometric data. The configuration of cells, as represented by multiple markers simultaneously, can be modeled arbitrarily well as a mixture of Gaussian distributions whose dimension equals the number of markers. Cellular subtypes may be related to one or multiple components of such mixtures, and fitted mixture models can be evaluated in the full set of markers as an alternative, or adjunct, to traditional subjective gating methods that rely on choosing one or two dimensions. Four-color flow data from human blood cells labeled with FITC-conjugated anti-CD3, PE-conjugated anti-CD8, PE-Cy5-conjugated anti-CD4, and APC-conjugated anti-CD19 antibodies were acquired on a FACSCalibur. Cells from four murine cell lines, JAWS II, RAW 264.7, CTLL-2, and A20, were also stained with FITC-conjugated anti-CD11c, PE-conjugated anti-CD11b, PE-Cy5-conjugated anti-CD8a, and PE-Cy7-conjugated anti-CD45R/B220 antibodies, respectively, and single-color flow data were collected on an LSRII. The data were fitted with a mixture of multivariate Gaussians using standard Bayesian statistical approaches and Markov chain Monte Carlo computations. Statistical mixture models were able to identify and purify major cell subsets in human peripheral blood, using an automated process that can be generalized to an arbitrary number of markers. The models were also validated against both traditional expert gating and synthetic mixtures of murine cell lines with known mixing proportions. This article describes studies of statistical mixture modeling of flow cytometric data and demonstrates the utility of the approach in examples with four-color flow data from human peripheral blood samples and synthetic mixtures of murine cell lines.
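To make the idea of fitting a multivariate Gaussian mixture to multi-marker data concrete, the following is a minimal sketch, not the authors' method. It swaps in maximum-likelihood expectation-maximization (EM) for the Bayesian Markov chain Monte Carlo fit described above, and it runs on simulated data rather than real cytometry measurements. The function names (`log_gaussian`, `farthest_point_seeds`, `fit_gmm_em`), the marker means, the spread, and the event counts are all hypothetical choices made for illustration.

```python
import numpy as np


def log_gaussian(X, mean, cov):
    """Log density of each row of X under a multivariate normal N(mean, cov)."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (X - mean).T)      # whitened residuals, shape (d, n)
    maha = np.sum(z ** 2, axis=0)                # squared Mahalanobis distances
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)


def farthest_point_seeds(X, k, rng):
    """Pick k starting points, each as far as possible from those already chosen."""
    seeds = [X[rng.integers(len(X))]]
    while len(seeds) < k:
        d2 = np.min([np.sum((X - s) ** 2, axis=1) for s in seeds], axis=0)
        seeds.append(X[np.argmax(d2)])
    return np.array(seeds)


def fit_gmm_em(X, k, n_iter=50, reg=1e-6, seed=0):
    """Fit a k-component Gaussian mixture by EM (a stand-in for the MCMC fit)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Start from hard assignments to the nearest seed point.
    seeds = farthest_point_seeds(X, k, rng)
    dist2 = ((X[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=2)
    resp = np.eye(k)[np.argmin(dist2, axis=1)]   # one-hot responsibilities
    for _ in range(n_iter):
        # M-step: update weights, means, and covariances from responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        covs = []
        for j in range(k):
            diff = X - means[j]
            covs.append((resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d))
        # E-step: recompute responsibilities in log space for numerical stability.
        log_r = np.stack(
            [np.log(weights[j]) + log_gaussian(X, means[j], covs[j]) for j in range(k)],
            axis=1,
        )
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
    return weights, means, np.array(covs), resp


# Simulated four-marker data (hypothetical log-scale intensities, ordered as
# CD3, CD8, CD4, CD19) for three populations mixed in known proportions.
rng = np.random.default_rng(42)
true_means = np.array([
    [4.0, 1.0, 4.0, 1.0],   # CD3+ CD4+ (assumed values)
    [4.0, 4.0, 1.0, 1.0],   # CD3+ CD8+ (assumed values)
    [1.0, 1.0, 1.0, 4.0],   # CD19+     (assumed values)
])
counts = [500, 300, 200]
X = np.vstack([rng.normal(m, 0.3, size=(c, 4)) for m, c in zip(true_means, counts)])

weights, means, covs, resp = fit_gmm_em(X, k=3)
labels = resp.argmax(axis=1)                     # most probable component per event
print("estimated mixing proportions:", np.round(np.sort(weights)[::-1], 3))
```

Comparing the estimated mixing proportions with the known counts mirrors, in a very simplified way, the validation against synthetic mixtures with known proportions. All four markers enter the fit at once, so no pair of dimensions has to be chosen in advance. A Bayesian fit would additionally provide posterior uncertainty on the component parameters, which this EM sketch does not.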