Cluster stability in the analysis of mass cytometry data.

Authors

Melchiotti Rossella, Gracio Filipe, Kordasti Shahram, Todd Alan K, de Rinaldis Emanuele

Affiliations

Guy's and St Thomas' NHS Foundation Trust and King's College London, Translational Bioinformatics Platform - R&D Department. Biomedical Research Centre, London, SE1 9RT, United Kingdom.

Department of Haematological Medicine, Cancer Studies Division King's College London, Rayne Institute, London, SE5 9NU, United Kingdom.

Publication information

Cytometry A. 2017 Jan;91(1):73-84. doi: 10.1002/cyto.a.23001. Epub 2016 Oct 18.

Abstract

Manual gating has traditionally been applied to cytometry data sets to identify cells based on protein expression. Mass cytometry allows a larger number of proteins to be measured simultaneously on each cell, and therefore provides a means to define cell clusters in a high-dimensional expression space. This enhancement opens unprecedented opportunities for single-cell analyses, but it also makes the incremental replacement of manual gating with automated clustering a compelling need. To this end, many methods have been implemented and successfully applied in different settings. However, reproducing automatically generated clusters is proving challenging, and an analytical framework to distinguish spurious clusters from more stable, and presumably more biologically relevant, ones is still missing. One way to estimate the stability of cell clusters is to evaluate their consistent re-occurrence within and between algorithms, a metric commonly used to evaluate results from gene expression studies. Herein we report the usage and importance of cluster stability evaluations applied to results generated by three popular clustering algorithms (SPADE, FLOCK and PhenoGraph) run on four different data sets. These algorithms were shown to generate clusters with varying degrees of statistical stability, many of them unstable. By comparing the results of automated clustering with manually gated populations, we illustrate how information on cluster stability can support a more rigorous and informed interpretation of clustering results. We also explore the relationships between statistical stability and other properties such as cluster compactness and isolation, demonstrating that although cluster stability is linked to these properties, it cannot be reliably predicted by any of them. Our study proposes the introduction of cluster stability as a necessary checkpoint for cluster interpretation and contributes to the construction of a more systematic and standardized analytical framework for the assessment of cytometry clustering results. © 2016 International Society for Advancement of Cytometry.
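
The abstract does not say how consistent re-occurrence is scored, so the following is only a minimal sketch of one common approach: re-cluster bootstrap resamples of the cells and, for each original cluster, record the best Jaccard overlap with any cluster found in the resample. Everything here is an assumption made for illustration rather than the authors' procedure. The function names (`kmeans`, `jaccard`, `cluster_stability`) are hypothetical, the synthetic two-marker data are invented, and a tiny k-means stands in for SPADE, FLOCK or PhenoGraph.

```python
import random


def kmeans(points, k, rng, iters=50):
    """Tiny Lloyd's k-means; an assumed stand-in for a real cytometry clusterer."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def jaccard(a, b):
    """Jaccard similarity of two sets of cell indices (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cluster_stability(points, k, n_boot=50, seed=0):
    """Mean best-match Jaccard per reference cluster over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(points)
    ref_labels = kmeans(points, k, rng)
    ref = [{i for i in range(n) if ref_labels[i] == c} for c in range(k)]
    totals = [0.0] * k
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample cells with replacement
        boot_labels = kmeans([points[i] for i in idx], k, rng)
        boot = [{i for i, lab in zip(idx, boot_labels) if lab == c} for c in range(k)]
        drawn = set(idx)
        for c in range(k):
            # Compare only the cells of the reference cluster that were resampled.
            totals[c] += max(jaccard(ref[c] & drawn, b) for b in boot)
    return [t / n_boot for t in totals]


# Synthetic example: two well-separated populations measured on two markers.
demo_rng = random.Random(1)
blobs = [
    (demo_rng.gauss(center, 0.5), demo_rng.gauss(center, 0.5))
    for center in (0.0, 10.0)
    for _ in range(40)
]
scores = cluster_stability(blobs, k=2)
print(scores)
```

Each score lies between 0 and 1, and values closer to 1 mean the cluster reappears more consistently across resamples. This sketch covers only within-algorithm stability. A between-algorithm comparison could reuse `jaccard` on the labelings produced by two different algorithms for the same cells.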
