SCUBA implements a storage format-agnostic API for single-cell data access in R.

Authors

Showers William M, Desai Jairav, Engel Krysta L, Smith Clayton, Jordan Craig T, Gillen Austin E

Affiliations

RefinedScience, Aurora, Colorado, USA.

Division of Hematology, University of Colorado Anschutz Medical Campus School of Medicine, Aurora, Colorado, USA.

Publication information

F1000Res. 2025 Jun 2;13:1256. doi: 10.12688/f1000research.154675.2. eCollection 2024.

DOI:10.12688/f1000research.154675.2

PMID:40822437

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12351237/

Abstract

While robust tools exist for the analysis of single-cell datasets in both Python and R, interoperability is limited, and analysis tools generally only accept one object class. Considerable programming expertise is required to integrate tools across package ecosystems into a comprehensive analysis, due to their differing languages and internal data structures. This complicates validation of results and leads to inconsistent visualizations between analysis suites. Conversion between object formats is the most common solution, but this is difficult and error-prone due to the rapid pace of development of the analysis suites and their underlying data structures. To address this, we created SCUBA (Single-Cell Unified Backend API), an R package that implements a unified data access API for all common R and Python single-cell object formats. SCUBA extends the data access approach from the widely used Seurat package to SingleCellExperiment and anndata objects. SCUBA also implements new data-specific access functions for all supported object types. Performance scales well across all SCUBA-supported formats. In addition to performance, SCUBA offers several advantages over object conversion for the visualization and further analysis of pre-processed single-cell data. First, SCUBA extracts only the data required for the operation at hand, leaving the original object unmodified. This process is simpler, less error-prone, and less memory-intensive than object conversion, which operates on the entire dataset. Second, code written with SCUBA can use any supported object class as input, with simple and consistent syntax across object formats. This allows a single analysis script or package (like our interactive single-cell browser, scExploreR) to work seamlessly with multiple object types, reducing the complexity of the code and improving both readability and reproducibility. Adoption of SCUBA will ultimately improve collaboration and reproducible research in single-cell analysis by lowering the barriers between package ecosystems.
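The idea of a format-agnostic accessor can be illustrated with a small toy sketch. This is not SCUBA's API and not R: it is a Python illustration in which every name (`GeneMajorStore`, `CellMajorStore`, `fetch_values`) is hypothetical, and the two classes are simplified stand-ins for objects that store the same expression data in different layouts. One generic function dispatches on the object class and returns only the requested features, leaving the input object untouched.

```python
from dataclasses import dataclass
from functools import singledispatch


# Hypothetical stand-ins for two storage layouts. These are NOT the real
# Seurat, SingleCellExperiment, or anndata classes.
@dataclass
class GeneMajorStore:
    """Expression stored as {gene: [one value per cell]}."""
    cells: list
    expr: dict


@dataclass
class CellMajorStore:
    """Expression stored as one row per cell plus a separate gene index."""
    cells: list
    genes: list
    rows: list


@singledispatch
def fetch_values(obj, features):
    """Return {cell: {feature: value}} for the requested features only."""
    raise TypeError(f"unsupported object class: {type(obj).__name__}")


@fetch_values.register(GeneMajorStore)
def _(obj, features):
    # Read only the requested genes; the stored object is not modified.
    return {
        cell: {f: obj.expr[f][i] for f in features}
        for i, cell in enumerate(obj.cells)
    }


@fetch_values.register(CellMajorStore)
def _(obj, features):
    # Look up column positions once, then slice each cell's row.
    cols = [obj.genes.index(f) for f in features]
    return {
        cell: {f: row[j] for f, j in zip(features, cols)}
        for cell, row in zip(obj.cells, obj.rows)
    }


# The same two cells and three genes, stored in two different layouts.
a = GeneMajorStore(
    cells=["c1", "c2"],
    expr={"CD3E": [1.0, 0.0], "MS4A1": [0.0, 2.5], "GAPDH": [5.0, 4.0]},
)
b = CellMajorStore(
    cells=["c1", "c2"],
    genes=["CD3E", "MS4A1", "GAPDH"],
    rows=[[1.0, 0.0, 5.0], [0.0, 2.5, 4.0]],
)

# Identical calling code works for either layout.
result_a = fetch_values(a, ["MS4A1"])
result_b = fetch_values(b, ["MS4A1"])
```

In this sketch, downstream code never needs to know which layout it was given, and only the requested feature is copied out. That mirrors, in miniature, the two advantages the abstract claims over whole-object conversion.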
