Department of Computer Science, Princeton University, Princeton, NJ 08540, USA; Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ 08540, USA.
Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ 08540, USA.
Cell Syst. 2024 Oct 16;15(10):982-990.e5. doi: 10.1016/j.cels.2024.09.003. Epub 2024 Oct 3.
To facilitate single-cell multi-omics analysis and improve reproducibility, we present single-cell pipeline for end-to-end data integration (SPEEDI), a fully automated end-to-end framework for batch inference, data integration, and cell-type labeling. SPEEDI introduces data-driven batch inference and transforms the often heterogeneous data matrices obtained from different samples into a uniformly annotated and integrated dataset. Without requiring user input, it automatically selects parameters and executes pre-processing, sample integration, and cell-type mapping. It can also perform downstream analyses of differential signals between treatment conditions and gene functional modules. SPEEDI's data-driven batch-inference method works with widely used integration and cell-typing tools. By developing data-driven batch inference, providing full end-to-end automation, and eliminating parameter selection, SPEEDI improves reproducibility and lowers the barrier to obtaining biological insight from these valuable single-cell datasets. The SPEEDI interactive web application can be accessed at https://speedi.princeton.edu/. A record of this paper's transparent peer review process is included in the supplemental information.
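The abstract does not describe how SPEEDI infers batches, so the following is only a minimal sketch of what "data-driven batch inference" could look like in general, not the published method. It assumes that samples with highly correlated pseudo-bulk (mean) expression profiles belong to the same batch. The function names, the correlation threshold of 0.9, and the toy data layout are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def pseudobulk(cells):
    """Average a cells-by-genes matrix (list of rows) into one profile."""
    n = len(cells)
    return [sum(col) / n for col in zip(*cells)]


def infer_batches(samples, threshold=0.9):
    """Assign a batch id to each sample from the data alone.

    Hypothetical rule (an assumption, not SPEEDI's algorithm): two samples
    share a batch if their pseudo-bulk profiles correlate at or above
    `threshold`; batches are the connected components of that relation.
    """
    names = sorted(samples)
    profiles = {name: pseudobulk(samples[name]) for name in names}
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(b)] = find(a)

    # Number the components in sorted sample order.
    roots = {}
    return {name: roots.setdefault(find(name), len(roots)) for name in names}


if __name__ == "__main__":
    # Toy input: sample name -> cells-by-genes count matrix.
    toy = {
        "s1": [[10, 1, 0, 5], [12, 0, 1, 6]],
        "s2": [[11, 1, 1, 5], [9, 2, 0, 4]],
        "s3": [[0, 9, 10, 1], [1, 11, 12, 0]],
    }
    print(infer_batches(toy))
```

In this toy input, samples s1 and s2 have similar average profiles and fall into one inferred batch, while s3 is placed in its own. The resulting labels could then be passed to whichever integration tool is in use, in place of user-supplied batch metadata.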