SCUBA implements a storage format-agnostic API for single-cell data access in R.

Author information

Showers William M, Desai Jairav, Engel Krysta L, Smith Clayton, Jordan Craig T, Gillen Austin E

Affiliations

RefinedScience, Aurora, Colorado, USA.

Division of Hematology, University of Colorado Anschutz Medical Campus School of Medicine, Aurora, Colorado, USA.

Publication information

F1000Res. 2025 Jun 2;13:1256. doi: 10.12688/f1000research.154675.2. eCollection 2024.

Abstract

While robust tools exist for the analysis of single-cell datasets in both Python and R, interoperability is limited, and analysis tools generally only accept one object class. Considerable programming expertise is required to integrate tools across package ecosystems into a comprehensive analysis, due to their differing languages and internal data structures. This complicates validation of results and leads to inconsistent visualizations between analysis suites. Conversion between object formats is the most common solution, but this is difficult and error-prone due to the rapid pace of development of the analysis suites and their underlying data structures. To address this, we created SCUBA (Single-Cell Unified Backend API), an R package that implements a unified data access API for all common R and Python single-cell object formats. SCUBA extends the data access approach from the widely used Seurat package to SingleCellExperiment and anndata objects. SCUBA also implements new data-specific access functions for all supported object types. Performance scales well across all SCUBA-supported formats. In addition to performance, SCUBA offers several advantages over object conversion for the visualization and further analysis of pre-processed single-cell data. First, SCUBA extracts only data required for the operation at hand, leaving the original object unmodified. This process is simpler, less error-prone, and less memory intensive than object conversion, which operates on the entire dataset. Second, code written with SCUBA can use any supported object class as input, with simple and consistent syntax across object formats. This allows a single analysis script or package (like our interactive single-cell browser, scExploreR) to work seamlessly with multiple object types, reducing the complexity of the code and improving both readability and reproducibility.
Adoption of SCUBA will ultimately improve collaboration and reproducible research in single-cell analysis by lowering the barriers between package ecosystems.
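
The idea of one access function that works across storage formats, pulling out only the requested data and leaving the source object untouched, can be sketched in a few lines. SCUBA itself is an R package; the Python below is only a conceptual illustration of class-based dispatch, and every name in it (`FeatureMajorStore`, `CellMajorStore`, `fetch_data`, the toy gene and cell labels) is a hypothetical stand-in rather than SCUBA's actual interface or the real Seurat, SingleCellExperiment, or anndata classes.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class FeatureMajorStore:
    """Toy layout: one row of values per feature (hypothetical stand-in)."""
    cells: list   # cell identifiers
    rows: dict    # feature name -> list of values, in `cells` order


@dataclass
class CellMajorStore:
    """Toy layout: one row of values per cell (hypothetical stand-in)."""
    obs_names: list   # cell identifiers
    var_names: list   # feature names
    X: list           # list of rows, one row per cell


@singledispatch
def fetch_data(obj, features):
    """Return {feature: {cell: value}} for the requested features only."""
    raise TypeError(f"unsupported object class: {type(obj).__name__}")


@fetch_data.register
def _(obj: FeatureMajorStore, features):
    # Each feature is already a row, so pair its values with the cell names.
    return {f: dict(zip(obj.cells, obj.rows[f])) for f in features}


@fetch_data.register
def _(obj: CellMajorStore, features):
    result = {}
    for f in features:
        col = obj.var_names.index(f)  # locate the feature's column
        result[f] = {cell: row[col] for cell, row in zip(obj.obs_names, obj.X)}
    return result


# The same three cells and two genes, stored in two different layouts.
feature_major = FeatureMajorStore(
    cells=["c1", "c2", "c3"],
    rows={"GAPDH": [5, 0, 2], "CD34": [1, 3, 0]},
)
cell_major = CellMajorStore(
    obs_names=["c1", "c2", "c3"],
    var_names=["GAPDH", "CD34"],
    X=[[5, 1], [0, 3], [2, 0]],
)

# Identical calling code works for either object class.
for dataset in (feature_major, cell_major):
    print(type(dataset).__name__, fetch_data(dataset, ["CD34"]))
```

In this sketch the caller never converts one layout into the other: each registered handler reads only the requested features from its own layout and returns them in a shared shape, which is the property the abstract contrasts with whole-object conversion.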


