Aghaeepour Nima, Brinkman Ryan
Terry Fox Laboratory, BC Cancer Agency, 675 West 10th Avenue, Vancouver BC, V5Z 1L3, Canada.
Curr Top Microbiol Immunol. 2014;377:159-75. doi: 10.1007/82_2013_337.
Recent technological advancements have enabled the flow cytometric measurement of tens of parameters on millions of cells. Because of this complexity, conventional manual data analysis and bioinformatics tools cannot provide a complete analysis of these datasets. In this chapter we provide an overview of a general data analysis pipeline, both for automatic identification of cell populations of known importance (e.g., diagnosis by identification of a predefined cell population) and for exploratory analysis of cohorts of flow cytometry assays (e.g., discovery of new correlates of a malignancy). We provide three real-world examples of how unsupervised discovery has been used in basic and clinical research. We also discuss challenges in evaluating the algorithms developed for (1) identification of cell populations using clustering, (2) identification of specific cell populations, and (3) supervised analysis for discriminating between patient subgroups.