Mocking T R, Duetz C, van Kuijk B J, Westers T M, Cloos J, Bachas C
Department of Hematology, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.
Cytometry A. 2023 Oct;103(10):818-829. doi: 10.1002/cyto.a.24774. Epub 2023 Jul 6.
Although modern techniques and analysis methods in multiparameter flow cytometry (MFC) allow cell populations to be characterized and quantified at increased dimensionality, most MFC applications still depend on flow cytometers that measure a relatively small number (<16) of parameters. When more markers need to be acquired than there are parameters available, the markers are commonly distributed over multiple independent measurements that share a backbone of common markers. Several methods have been proposed to impute values for combinations of markers that were not measured simultaneously. These imputation methods are frequently used without proper validation and without knowledge of their effects on data analysis. We evaluated how well existing imputation software (Infinicyt, CyTOFmerge, CytoBackBone, and cyCombine) approximates known measured expression data. To do so, we split MFC samples into separate measurements with partially overlapping markers, re-calculated the missing marker expression, and compared the result with the measured data in terms of visual appearance, cell expression, and gating in different datasets. Of the assessed packages, CyTOFmerge approximated the known expression most accurately in terms of similar expression values and concordance with manual gating, with a mean F-score between 0.53 and 0.87 when retrieving cell populations in different datasets. Nevertheless, performance remained inadequate for all methods, with only limited similarity at the cell level. In conclusion, the use of imputed MFC data should take such limitations into account and include independent validation of results to justify conclusions.
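The evaluation design described in the abstract (split a fully measured sample into tubes that share backbone markers, impute the markers missing from one tube, then compare gating on imputed versus measured values with an F-score) can be illustrated with a toy sketch. This is not the algorithm of Infinicyt, CyTOFmerge, CytoBackBone, or cyCombine, and it is not the authors' pipeline: the nearest-neighbour donor rule, the marker names (`bb`, `a`, `b`), the gate threshold, and all values are assumptions made up for illustration.

```python
import math


def split_panel(cells, backbone, panel_a, panel_b):
    """Split fully measured cells into two tubes that share only the backbone.

    Assumption: the first half of the cells goes to tube A, the second half
    to tube B. Tube A keeps backbone + panel_a, tube B keeps backbone + panel_b.
    """
    half = len(cells) // 2
    tube_a = [{m: c[m] for m in backbone + panel_a} for c in cells[:half]]
    tube_b = [{m: c[m] for m in backbone + panel_b} for c in cells[half:]]
    return tube_a, tube_b


def impute_nearest(tube_a, tube_b, backbone, missing):
    """Fill `missing` markers in tube A from the nearest tube B cell.

    Nearness is Euclidean distance over the backbone markers only.
    This is a deliberately simple stand-in for a real imputation method.
    """
    imputed = []
    for cell in tube_a:
        point = [cell[m] for m in backbone]
        donor = min(
            tube_b,
            key=lambda d: math.dist(point, [d[m] for m in backbone]),
        )
        filled = dict(cell)
        for m in missing:
            filled[m] = donor[m]
        imputed.append(filled)
    return imputed


def f_score(truth, predicted):
    """F1 score of predicted gate membership against measured gate membership."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy sample in which every marker was measured on every cell (hypothetical values).
cells = [
    {"bb": 0.1, "a": 1.0, "b": 5.0},
    {"bb": 0.9, "a": 2.0, "b": 0.5},
    {"bb": 0.2, "a": 1.1, "b": 4.8},
    {"bb": 1.0, "a": 2.1, "b": 0.4},
]

tube_a, tube_b = split_panel(cells, ["bb"], ["a"], ["b"])
imputed_a = impute_nearest(tube_a, tube_b, ["bb"], ["b"])

GATE = 2.0  # hypothetical threshold: a cell is "b-positive" if b > GATE
half = len(cells) // 2
truth = [c["b"] > GATE for c in cells[:half]]    # gate on measured values
predicted = [c["b"] > GATE for c in imputed_a]   # gate on imputed values
score = f_score(truth, predicted)
```

Because the original sample contains the measured value of `b` for the tube A cells, gate membership computed from imputed values can be checked directly against the measured data. In this tiny, cleanly separated example the two agree. The abstract's point is that on real data agreement at the single-cell level was limited for all evaluated methods, so a check of this kind is worth running before drawing conclusions from imputed markers.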