Ashhurst Thomas Myles, Marsh-Wakefield Felix, Putri Givanna Haryono, Spiteri Alanna Gabrielle, Shinko Diana, Read Mark Norman, Smith Adrian Lloyd, King Nicholas Jonathan Cole
Sydney Cytometry Core Research Facility, Charles Perkins Centre, Centenary Institute and The University of Sydney, Sydney, New South Wales, Australia.
Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, Sydney, New South Wales, Australia.
Cytometry A. 2022 Mar;101(3):237-253. doi: 10.1002/cyto.a.24350. Epub 2021 Apr 26.
As the size and complexity of high-dimensional (HD) cytometry data continue to expand, comprehensive, scalable, and methodical computational analysis approaches are essential. Yet contemporary clustering and dimensionality reduction tools alone are insufficient to analyze or reproduce analyses across large numbers of samples, batches, or experiments. Moreover, approaches that integrate data across batches or experiments are not well incorporated into computational toolkits, which hinders streamlined workflows. Here we present Spectre, an R package that enables comprehensive end-to-end integration and analysis of HD cytometry data from different batches or experiments. Spectre streamlines the analytical stages of raw data pre-processing, batch alignment, data integration, clustering, dimensionality reduction, visualization, and population labelling, as well as quantitative and statistical analysis. Critically, the fundamental data structures used within Spectre, along with the implementation of machine learning classifiers, allow for the scalable analysis of very large HD datasets generated by flow cytometry, mass cytometry, or spectral cytometry. Using open and flexible data structures, Spectre can also be used to analyze data generated by single-cell RNA sequencing or HD imaging technologies, such as Imaging Mass Cytometry. The simple, clear, and modular design of analysis workflows allows these tools to be used by bioinformaticians and laboratory scientists alike. Spectre is available as an R package or Docker container. R code is available on GitHub (https://github.com/immunedynamics/spectre).