optimalFlow: optimal transport approach to flow cytometry gating and population matching.

Affiliations

Departamento de Estadística e Investigación Operativa, Universidad de Valladolid, Calle Paseo de Belén, Valladolid, Spain.

IMUVA, Calle Paseo de Belén, Valladolid, Spain.

Publication information

BMC Bioinformatics. 2020 Oct 27;21(1):479. doi: 10.1186/s12859-020-03795-w.

DOI: 10.1186/s12859-020-03795-w
PMID: 33109072
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7590740/
Abstract

BACKGROUND

Data obtained from flow cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well-known phenomenon produced by measurements on different individuals, with different characteristics such as illness, age, sex, etc. The use of different settings for measurement, the variation of the conditions during experiments and the different types of flow cytometers are some of the technical causes of variability. This mixture of sources of variability makes the use of supervised machine learning for identification of cell populations difficult. The present work is conceived as a combination of strategies to facilitate the task of supervised gating.

RESULTS

We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces prototype cytometries for the different groups. We show that supervised learning, restricted to the new groups, performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show that this procedure can outperform state-of-the-art techniques on the proposed datasets. Our code is freely available as optimalFlow, a Bioconductor R package, at https://bioconductor.org/packages/optimalFlow.
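
To make the idea of a Wasserstein barycenter as a "prototype" concrete, the following is a minimal toy sketch in Python. It is not the optimalFlow package or its API, and it is not the authors' algorithm: it assumes one-dimensional samples of equal size, where the equal-weight 2-Wasserstein barycenter reduces to averaging sorted values. The function names `wasserstein2_1d` and `barycenter_1d` and the toy numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt


def wasserstein2_1d(xs, ys):
    """2-Wasserstein distance between two equal-size 1D empirical samples.

    In one dimension the optimal transport plan matches sorted values,
    so the distance is the root mean squared gap between order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    a, b = sorted(xs), sorted(ys)
    return sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


def barycenter_1d(samples):
    """Equal-weight 2-Wasserstein barycenter of equal-size 1D samples.

    In one dimension the barycenter's quantile function is the average of
    the input quantile functions, so averaging sorted values suffices.
    """
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples[0])
    if any(len(s) != n for s in sorted_samples):
        raise ValueError("this sketch assumes equal sample sizes")
    return [sum(col) / len(col) for col in zip(*sorted_samples)]


# Toy data (hypothetical): one marker measured in two samples of a group.
group = [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]]

# The barycenter plays the role of a prototype (template) for the group.
template = barycenter_1d(group)

# A new sample can then be compared with the template.
new_sample = [1.5, 2.5, 3.5]
dist_to_template = wasserstein2_1d(new_sample, template)

print(template)
print(dist_to_template)
```

Real cytometry data are multivariate, so the sorting shortcut used here does not carry over; the sketch only illustrates why a barycenter can serve as a representative of a group of similar samples.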

CONCLUSIONS

optimalFlowTemplates + optimalFlowClassification addresses the problem of using supervised learning while accounting for biological and technical variability. Our methodology provides a robust automated gating workflow that handles the intrinsic variability of flow cytometry data well. Our main innovation is the methodology itself and the optimal transport techniques that we apply to flow cytometry analysis.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5ff/7590740/ded7a161b256/12859_2020_3795_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5ff/7590740/015ae03077ca/12859_2020_3795_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5ff/7590740/7cc16fb3bc02/12859_2020_3795_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5ff/7590740/c5a19a82a4c0/12859_2020_3795_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5ff/7590740/c2976a179a06/12859_2020_3795_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5ff/7590740/200c7ad25b4c/12859_2020_3795_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5ff/7590740/db5f730cc68c/12859_2020_3795_Fig7_HTML.jpg