Ultrafast clustering of single-cell flow cytometry data using FlowGrid.

Author information

Ye Xiaoxin, Ho Joshua W K

Affiliations

Victor Chang Cardiac Research Institute, Sydney, Australia.

University of New South Wales, Sydney, Australia.

Publication information

BMC Syst Biol. 2019 Apr 5;13(Suppl 2):35. doi: 10.1186/s12918-019-0690-2.

DOI: 10.1186/s12918-019-0690-2
PMID: 30953498
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6449887/
Abstract

BACKGROUND

Flow cytometry is a popular technology for quantitative single-cell profiling of cell surface markers. It enables expression measurement of tens of cell surface protein markers in millions of single cells. It is a powerful tool for discovering cell sub-populations and quantifying cell population heterogeneity. Traditionally, scientists use manual gating to identify cell types, but the process is subjective and is not effective for large multidimensional data. Many clustering algorithms have been developed to analyse these data but most of them are not scalable to very large data sets with more than ten million cells.

RESULTS

Here, we present a new clustering algorithm that combines the advantages of the density-based clustering algorithm DBSCAN with the scalability of grid-based clustering. The algorithm is implemented in Python as an open-source package, FlowGrid. FlowGrid is memory-efficient and scales linearly with the number of cells. We evaluated FlowGrid against other state-of-the-art clustering programs and found that it produces similar clustering results in substantially less time. For example, FlowGrid completes a clustering task on a data set of 23.6 million cells in less than 12 seconds, while other algorithms take more than 500 seconds or fail with an error.
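
The general idea of pairing a density criterion with a grid can be illustrated with a small sketch. The code below is not the FlowGrid implementation and does not reproduce its parameters. It is a minimal toy version written under stated assumptions: the function name `grid_density_cluster`, the parameters `n_bins` and `min_count`, and the rule that dense bins touching each other (including diagonally) belong to the same cluster are all hypothetical choices made for illustration.

```python
from collections import Counter, deque
from itertools import product


def grid_density_cluster(points, n_bins=4, min_count=3):
    """Toy grid-based density clustering (illustrative only, not FlowGrid).

    Returns one integer label per point; -1 marks points in sparse bins.
    """
    points = [tuple(map(float, p)) for p in points]
    n_dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(n_dims)]
    hi = [max(p[d] for p in points) for d in range(n_dims)]

    def bin_of(p):
        # Map each coordinate to an integer bin index in [0, n_bins - 1].
        idx = []
        for d in range(n_dims):
            span = hi[d] - lo[d]
            frac = (p[d] - lo[d]) / span if span > 0 else 0.0
            idx.append(min(int(frac * n_bins), n_bins - 1))
        return tuple(idx)

    # One pass over the data: assign points to bins and count per bin.
    bin_of_point = [bin_of(p) for p in points]
    counts = Counter(bin_of_point)

    # Assumed density rule: a bin is "dense" if it holds >= min_count points.
    dense = {b for b, c in counts.items() if c >= min_count}

    # Neighbouring bins differ by at most 1 in every dimension.
    # Note: this enumerates 3**n_dims - 1 offsets, fine only for few dimensions.
    offsets = [o for o in product((-1, 0, 1), repeat=n_dims) if any(o)]

    # Breadth-first search over dense bins groups connected bins into clusters.
    label_of_bin = {}
    next_label = 0
    for seed in sorted(dense):
        if seed in label_of_bin:
            continue
        label_of_bin[seed] = next_label
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(current, off))
                if nb in dense and nb not in label_of_bin:
                    label_of_bin[nb] = next_label
                    queue.append(nb)
        next_label += 1

    return [label_of_bin.get(b, -1) for b in bin_of_point]
```

In this sketch the cells are touched only once, when they are assigned to bins. The connectivity search then runs over occupied bins rather than over individual cells, which is the usual reason grid-based methods scale well with the number of points. Calling `grid_density_cluster(data, n_bins=4, min_count=3)` on a list of coordinate tuples returns a list of cluster labels in the same order as the input.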

CONCLUSIONS

FlowGrid is an ultrafast clustering algorithm for large single-cell flow cytometry data. The source code is available at https://github.com/VCCRI/FlowGrid.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe2/6449887/7393cc5c2fae/12918_2019_690_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe2/6449887/62a63da1b427/12918_2019_690_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe2/6449887/28e63d890b3d/12918_2019_690_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe2/6449887/938980acf83a/12918_2019_690_Fig4_HTML.jpg

Similar articles

1
Ultrafast clustering of single-cell flow cytometry data using FlowGrid.
BMC Syst Biol. 2019 Apr 5;13(Suppl 2):35. doi: 10.1186/s12918-019-0690-2.
2
FlowGrid enables fast clustering of very large single-cell RNA-seq data.
Bioinformatics. 2021 Dec 22;38(1):282-283. doi: 10.1093/bioinformatics/btab521.
3
Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data.
Cytometry A. 2016 Dec;89(12):1084-1096. doi: 10.1002/cyto.a.23030. Epub 2016 Dec 19.
4
Predicting Cell Populations in Single Cell Mass Cytometry Data.
Cytometry A. 2019 Jul;95(7):769-781. doi: 10.1002/cyto.a.23738. Epub 2019 Mar 12.
5
PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells.
Bioinformatics. 2020 May 1;36(9):2778-2786. doi: 10.1093/bioinformatics/btaa042.
6
CyCadas: accelerating interactive annotation and analysis of clustered cytometry data.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae595.
7
Data reduction for spectral clustering to analyze high throughput flow cytometry data.
BMC Bioinformatics. 2010 Jul 28;11:403. doi: 10.1186/1471-2105-11-403.
8
Misty Mountain clustering: application to fast unsupervised flow cytometry gating.
BMC Bioinformatics. 2010 Oct 9;11:502. doi: 10.1186/1471-2105-11-502.
9
Scalable clustering algorithms for continuous environmental flow cytometry.
Bioinformatics. 2016 Feb 1;32(3):417-23. doi: 10.1093/bioinformatics/btv594. Epub 2015 Oct 17.
10
Efficient cytometry analysis with FlowSOM in Python boosts interoperability with other single-cell tools.
Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae179.

Cited by

1
Machine Learning Methods in Clinical Flow Cytometry.
Cancers (Basel). 2025 Feb 1;17(3):483. doi: 10.3390/cancers17030483.
2
Comprehensive evaluation and practical guideline of gating methods for high-dimensional cytometry data: manual gating, unsupervised clustering, and auto-gating.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae633.
3
CyCadas: accelerating interactive annotation and analysis of clustered cytometry data.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae595.
4
Efficient sex separation by exploiting differential alternative splicing of a dominant marker in Aedes aegypti.
PLoS Genet. 2023 Nov 27;19(11):e1011065. doi: 10.1371/journal.pgen.1011065. eCollection 2023 Nov.
5
INFLECT: an R-package for cytometry cluster evaluation using marker modality.
BMC Bioinformatics. 2022 Nov 16;23(1):487. doi: 10.1186/s12859-022-05018-w.
6
Flow Cytometry: A Blessing and a Curse.
Biomedicines. 2021 Nov 4;9(11):1613. doi: 10.3390/biomedicines9111613.
7
Single-Cell Multiomics Analysis for Drug Discovery.
Metabolites. 2021 Oct 25;11(11):729. doi: 10.3390/metabo11110729.
8
Optimal distribution-preserving downsampling of large biomedical data sets (opdisDownsampling).
PLoS One. 2021 Aug 5;16(8):e0255838. doi: 10.1371/journal.pone.0255838. eCollection 2021.
9
Analyzing high-dimensional cytometry data using FlowSOM.
Nat Protoc. 2021 Aug;16(8):3775-3801. doi: 10.1038/s41596-021-00550-0. Epub 2021 Jun 25.
10
MET Exon 14 Skipping: A Case Study for the Detection of Genetic Variants in Cancer Driver Genes by Deep Learning.
Int J Mol Sci. 2021 Apr 19;22(8):4217. doi: 10.3390/ijms22084217.

References

1
Gating mass cytometry data by deep learning.
Bioinformatics. 2017 Nov 1;33(21):3423-3430. doi: 10.1093/bioinformatics/btx448.
2
Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data.
Cytometry A. 2016 Dec;89(12):1084-1096. doi: 10.1002/cyto.a.23030. Epub 2016 Dec 19.
3
Computational flow cytometry: helping to make sense of high-dimensional immunology data.
Nat Rev Immunol. 2016 Jul;16(7):449-62. doi: 10.1038/nri.2016.56. Epub 2016 Jun 20.
4
BayesFlow: latent modeling of flow cytometry cell populations.
BMC Bioinformatics. 2016 Jan 12;17:25. doi: 10.1186/s12859-015-0862-z.
5
Scalable clustering algorithms for continuous environmental flow cytometry.
Bioinformatics. 2016 Feb 1;32(3):417-23. doi: 10.1093/bioinformatics/btv594. Epub 2015 Oct 17.
6
FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data.
Cytometry A. 2015 Jul;87(7):636-45. doi: 10.1002/cyto.a.22625. Epub 2015 Jan 8.
7
Critical assessment of automated flow cytometry data analysis techniques.
Nat Methods. 2013 Mar;10(3):228-38. doi: 10.1038/nmeth.2365. Epub 2013 Feb 10.
8
flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding.
Bioinformatics. 2012 Aug 1;28(15):2052-8. doi: 10.1093/bioinformatics/bts300. Epub 2012 May 17.
9
Elucidation of seventeen human peripheral blood B-cell subsets and quantification of the tetanus response using a density-based method for the automated identification of cell populations in multidimensional flow cytometry data.
Cytometry B Clin Cytom. 2010;78 Suppl 1(Suppl 1):S69-82. doi: 10.1002/cyto.b.20554.