FlowGrid enables fast clustering of very large single-cell RNA-seq data.

Author information

Fang Xiunan, Ho Joshua W K

Affiliations

School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.

Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, New Territories, Hong Kong SAR, China.

Publication information

Bioinformatics. 2021 Dec 22;38(1):282-283. doi: 10.1093/bioinformatics/btab521.

Abstract

MOTIVATION

Scalable clustering algorithms are needed to analyze the millions of cells in single-cell RNA-seq (scRNA-seq) datasets.

RESULTS

Here, we present FlowGrid, an open-source Python package that integrates into the Scanpy workflow to cluster very large scRNA-seq datasets. FlowGrid implements a fast density-based clustering algorithm originally designed for flow cytometry data analysis. We introduce a new automated parameter tuning procedure and show that FlowGrid achieves clustering accuracy comparable to that of state-of-the-art clustering algorithms, at a substantially reduced run time for very large scRNA-seq datasets. For example, FlowGrid can complete in about five minutes a clustering task on one million cells that otherwise takes about one hour.
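
To illustrate the general idea behind grid-based density clustering, the following is a minimal sketch and not FlowGrid's actual implementation or API. It assumes a simple scheme: points are assigned to equal-width bins, bins holding at least `min_density` points are treated as dense, and adjacent dense bins are merged into one cluster. The function name `grid_density_cluster` and the parameters `n_bins` and `min_density` are hypothetical and chosen only for this example.

```python
from collections import deque
import itertools

import numpy as np


def grid_density_cluster(X, n_bins=4, min_density=3):
    """Toy grid-based density clustering (illustrative only, not FlowGrid).

    Returns one integer label per row of X; -1 marks points in sparse bins.
    """
    X = np.asarray(X, dtype=float)

    # Rescale every dimension to [0, 1] and map each point to a bin index.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    bins = np.minimum(((X - lo) / span * n_bins).astype(int), n_bins - 1)

    # Count points per occupied bin; `inverse` maps each point to its bin.
    uniq, inverse, counts = np.unique(
        bins, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.ravel()

    # Keep only bins that hold enough points (hypothetical density rule).
    dense = {
        tuple(b): i
        for i, (b, c) in enumerate(zip(uniq.tolist(), counts))
        if c >= min_density
    }

    # Merge adjacent dense bins with a breadth-first search.
    # Note: this neighbourhood has 3**d offsets, so it only suits low dimensions.
    offsets = list(itertools.product((-1, 0, 1), repeat=X.shape[1]))
    bin_label = np.full(len(uniq), -1)
    next_label = 0
    for key, idx in dense.items():
        if bin_label[idx] != -1:
            continue
        bin_label[idx] = next_label
        queue = deque([key])
        while queue:
            cur = queue.popleft()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cur, off))
                j = dense.get(nb)
                if j is not None and bin_label[j] == -1:
                    bin_label[j] = next_label
                    queue.append(nb)
        next_label += 1

    # Each point inherits the label of the bin it falls into.
    return bin_label[inverse]


# Two tight groups of points plus one isolated point.
X = np.array([
    [0.0, 0.0], [0.1, 0.1], [0.05, 0.02], [0.02, 0.08],
    [10.0, 10.0], [9.9, 9.95], [9.95, 9.9], [9.92, 9.97],
    [5.0, 5.0],
])
labels = grid_density_cluster(X, n_bins=4, min_density=3)
```

Because the work is done on occupied bins rather than on pairs of points, the cost of the merging step depends on the number of bins, which is one reason grid-based approaches can scale to large datasets. In practice such a method would typically be run on a low-dimensional embedding (for example, a few principal components) rather than on raw expression values.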

AVAILABILITY AND IMPLEMENTATION

https://github.com/holab-hku/FlowGrid.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
