Lo Kenneth, Hahne Florian, Brinkman Ryan R, Gottardo Raphael
Department of Statistics, University of British Columbia, 333-6356 Agricultural Road, Vancouver, BC, V6T1Z2, Canada.
BMC Bioinformatics. 2009 May 14;10:145. doi: 10.1186/1471-2105-10-145.
As a high-throughput technology that offers rapid quantification of multidimensional characteristics for millions of cells, flow cytometry (FCM) is widely used in health research, medical diagnosis and treatment, and vaccine development. Nevertheless, there is increasing concern about the lack of software tools that provide automated analysis able to keep pace with this high-throughput data generation. Currently, FCM data analysis relies largely on the manual selection of sequential regions in 2-D graphical projections to extract the cell populations of interest. This is a time-consuming task that ignores the high dimensionality of FCM data.
To address these issues, we have developed an R package called flowClust to automate FCM analysis. flowClust implements a robust model-based clustering approach based on multivariate t mixture models with the Box-Cox transformation. The package identifies cell populations while simultaneously handling two commonly encountered issues, outlier identification and data transformation. It offers various tools to summarize and visualize a wealth of features of the clustering results. In addition, for ease of use, flowClust has been adapted for the current FCM data format and integrated with existing Bioconductor packages dedicated to FCM analysis.
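The two ingredients named above, the Box-Cox transformation and the outlier-resistant t distribution, can be illustrated with a minimal Python sketch. This is not flowClust's code or API: the function names `box_cox` and `t_weight` are hypothetical, the transform is the standard textbook Box-Cox form (assumed here, valid only for positive values), and the weight is the usual univariate EM weight for a t component, shown in one dimension for brevity rather than in the multivariate form the package uses.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transform (assumed form; requires y > 0)."""
    if y <= 0:
        raise ValueError("standard Box-Cox requires y > 0")
    if lam == 0:
        # Limit of (y**lam - 1) / lam as lam -> 0
        return math.log(y)
    return (y ** lam - 1.0) / lam


def t_weight(x, mu, sigma, nu):
    """Univariate EM weight for a t component with nu degrees of freedom.

    The weight shrinks as x moves away from mu, so distant points
    (potential outliers) have less influence on the fitted parameters.
    """
    d2 = ((x - mu) / sigma) ** 2  # squared standardized distance
    return (nu + 1.0) / (nu + d2)


if __name__ == "__main__":
    # Hypothetical intensities; the last value plays the role of an outlier.
    values = [100.0, 110.0, 95.0, 105.0, 5000.0]
    transformed = [box_cox(v, 0.5) for v in values]
    center = sorted(transformed)[len(transformed) // 2]  # median as a rough centre
    for raw, z in zip(values, transformed):
        w = t_weight(z, center, sigma=1.0, nu=4.0)
        print(f"raw={raw:8.1f}  transformed={z:8.3f}  weight={w:.4f}")
```

In this sketch the point far from the centre receives a much smaller weight than the others, which is the mechanism by which a t mixture resists outliers where a Gaussian mixture would not.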
flowClust addresses the dearth of software that automates FCM analysis on a sound theoretical foundation. It tends to give reproducible results, and helps reduce the considerable subjectivity and human time cost of FCM analysis. The package contributes to the cytometry community by offering an efficient, automated analysis platform that supports the field's active, ongoing technological advancement.