Probability binning comparison: a metric for quantitating multivariate distribution differences.

Author information

Roederer M, Moore W, Treister A, Hardy R R, Herzenberg L A

Affiliation

Vaccine Research Center, NIH, Bethesda, Maryland 20892-3015, USA.

Publication information

Cytometry. 2001 Sep 1;45(1):47-55. doi: 10.1002/1097-0320(20010901)45:1<47::aid-cyto1143>3.0.co;2-a.

PMID:11598946

Abstract

BACKGROUND

While several algorithms for the comparison of univariate distributions arising from flow cytometric analyses have been developed and studied for many years, algorithms for comparing multivariate distributions remain elusive. Such algorithms could be useful for comparing differences between samples based on several independent measurements, rather than differences based on any single measurement. It is conceivable that distributions could be completely distinct in multivariate space, but unresolvable in any combination of univariate histograms. Multivariate comparisons could also be useful for providing feedback about instrument stability, when only subtle changes in measurements are occurring.

METHODS

We apply a variant of Probability Binning, described in the accompanying article, to multidimensional data. In this approach, hyper-rectangles of n dimensions (where n is the number of measurements being compared) comprise the bins used for the chi-squared statistic. These hyper-dimensional bins are constructed such that the control sample has the same number of events in each bin; the bins are then applied to the test samples for chi-squared calculations.
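The binning step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the hyper-rectangles are chosen, so the sketch assumes recursive median splits along the highest-variance dimension, and it assumes one common normalized two-sample form of the chi-squared statistic. The function names `build_bins`, `bin_fractions`, and `pb_chi2` are hypothetical.

```python
import numpy as np

def build_bins(control, n_splits):
    """Split the control events into 2**n_splits hyper-rectangles.

    Assumption: each bin is halved at the median of its highest-variance
    dimension, so every bin holds the same number of control events
    (given continuous data and a control size divisible by 2**n_splits).
    """
    d = control.shape[1]
    bins = [(np.full(d, -np.inf), np.full(d, np.inf), control)]
    for _ in range(n_splits):
        next_bins = []
        for lo, hi, pts in bins:
            axis = int(np.argmax(pts.var(axis=0)))
            cut = np.median(pts[:, axis])
            left_hi, right_lo = hi.copy(), lo.copy()
            left_hi[axis] = cut
            right_lo[axis] = cut
            below = pts[:, axis] < cut
            next_bins.append((lo, left_hi, pts[below]))
            next_bins.append((right_lo, hi, pts[~below]))
        bins = next_bins
    return [(lo, hi) for lo, hi, _ in bins]

def bin_fractions(events, bins):
    """Fraction of events falling in each bin (lower bound inclusive)."""
    counts = [np.all((events >= lo) & (events < hi), axis=1).sum()
              for lo, hi in bins]
    return np.array(counts) / len(events)

def pb_chi2(control, test, n_splits=4):
    """Chi-squared between control and test over control-derived bins.

    Assumed form: sum over bins of (c - s)**2 / (c + s), where c and s
    are the control and test fractions in each bin.
    """
    bins = build_bins(control, n_splits)
    c = bin_fractions(control, bins)
    s = bin_fractions(test, bins)
    return float(np.sum((c - s) ** 2 / (c + s)))

# Synthetic three-parameter data, for illustration only.
rng = np.random.default_rng(0)
control = rng.normal(size=(4096, 3))
same = rng.normal(size=(4096, 3))              # drawn from the control distribution
shifted = rng.normal(loc=0.5, size=(4096, 3))  # mean shifted in every dimension
chi2_same = pb_chi2(control, same)
chi2_shifted = pb_chi2(control, shifted)
```

Because the bins are derived from the control sample alone, the same bin boundaries can be reused for any number of test samples.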

RESULTS

Using a Monte-Carlo simulation, we determined the distribution of chi-squared values obtained by comparing sets of events from the same distribution; this distribution of chi-squared values was identical to that for the univariate algorithm. Hence, the same formulae can be used to construct a metric, analogous to a t-score, that estimates the probability with which distributions are distinct. As for univariate comparisons, this metric scales with the difference between two distributions, and can be used to rank samples according to similarity to a control. We apply the algorithm to multivariate immunophenotyping data, and demonstrate that it can be used to discriminate distinct samples and to rank samples according to a biologically meaningful difference.
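The Monte-Carlo idea can be illustrated with a short simulation. The abstract does not reproduce the closed-form formulae, so this sketch estimates the null mean and standard deviation of the chi-squared value empirically and standardizes an observed value against them; it is a stand-in for, not a reproduction of, the published metric. The binning rule (median splits along the highest-variance dimension), the chi-squared form, and all function names are assumptions.

```python
import numpy as np

def fit_splits(pts, depth):
    """Build a tree of median splits on the control events (None marks a leaf)."""
    if depth == 0:
        return None
    axis = int(np.argmax(pts.var(axis=0)))
    cut = np.median(pts[:, axis])
    below = pts[:, axis] < cut
    return (axis, cut,
            fit_splits(pts[below], depth - 1),
            fit_splits(pts[~below], depth - 1))

def bin_counts(tree, pts):
    """Count events per leaf bin by routing them through the split tree."""
    if tree is None:
        return [len(pts)]
    axis, cut, left, right = tree
    below = pts[:, axis] < cut
    return bin_counts(left, pts[below]) + bin_counts(right, pts[~below])

def pb_chi2(control, test, depth=4):
    """Assumed statistic: sum of (c - s)**2 / (c + s) over bin fractions."""
    tree = fit_splits(control, depth)
    c = np.array(bin_counts(tree, control)) / len(control)
    s = np.array(bin_counts(tree, test)) / len(test)
    return float(np.sum((c - s) ** 2 / (c + s)))

def null_distribution(n_events, n_dims, n_trials, rng, depth=4):
    """Chi-squared values for pairs of samples drawn from the same distribution."""
    return np.array([
        pb_chi2(rng.normal(size=(n_events, n_dims)),
                rng.normal(size=(n_events, n_dims)), depth)
        for _ in range(n_trials)
    ])

def standardized_score(chi2, null_values):
    """Standard deviations above the simulated null mean, floored at zero."""
    return max(0.0, (chi2 - null_values.mean()) / null_values.std(ddof=1))

# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
null_values = null_distribution(n_events=2048, n_dims=3, n_trials=200, rng=rng)
control = rng.normal(size=(2048, 3))
small_shift = rng.normal(loc=0.2, size=(2048, 3))
large_shift = rng.normal(loc=1.0, size=(2048, 3))
score_small = standardized_score(pb_chi2(control, small_shift), null_values)
score_large = standardized_score(pb_chi2(control, large_shift), null_values)
```

Sorting test samples by such a score is one way to rank them by similarity to the control, in the spirit of the ranking described above.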

CONCLUSION

Probability binning, as shown here, provides a useful metric for determining the probability with which two or more multivariate distributions represent distinct sets of data. The metric can be used to identify the similarity or dissimilarity of samples. Finally, as demonstrated in the accompanying paper, the algorithm can be used to gate on events in one sample that are different from a control sample, even if those events cannot be distinguished on the basis of any combination of univariate or bivariate displays. Published 2001 Wiley-Liss, Inc.

Similar articles

Probability binning comparison: a metric for quantitating multivariate distribution differences.

Cytometry. 2001 Sep 1;45(1):47-55. doi: 10.1002/1097-0320(20010901)45:1<47::aid-cyto1143>3.0.co;2-a.

Probability binning comparison: a metric for quantitating univariate distribution differences.

Cytometry. 2001 Sep 1;45(1):37-46. doi: 10.1002/1097-0320(20010901)45:1<37::aid-cyto1142>3.0.co;2-e.

Frequency difference gating: a multivariate method for identifying subsets that differ between samples.

Cytometry. 2001 Sep 1;45(1):56-64. doi: 10.1002/1097-0320(20010901)45:1<56::aid-cyto1144>3.0.co;2-9.

Probability binning and testing agreement between multivariate immunofluorescence histograms: extending the chi-squared test.

Cytometry. 2001 Oct 1;45(2):141-50. doi: 10.1002/1097-0320(20011001)45:2<141::aid-cyto1156>3.0.co;2-m.

Quadratic form: a robust metric for quantitative comparison of flow cytometric histograms.

Cytometry A. 2008 Aug;73(8):715-26. doi: 10.1002/cyto.a.20586.

Sequential univariate gating approach to study the effects of erythropoietin in murine bone marrow.

Cytometry A. 2008 Aug;73(8):702-14. doi: 10.1002/cyto.a.20584.

Profiling of polychromatic flow cytometry data on B-cells reveals patients' clusters in common variable immunodeficiency.

Cytometry A. 2009 Nov;75(11):902-9. doi: 10.1002/cyto.a.20801.

Multiway contingency tables: Monte Carlo resampling probability values for the chi-squared and likelihood-ratio tests.

Psychol Rep. 2010 Oct;107(2):501-10. doi: 10.2466/03.PR0.107.5.501-510.

Mapping cell populations in flow cytometry data for cross-sample comparison using the Friedman-Rafsky test statistic as a distance measure.

Cytometry A. 2016 Jan;89(1):71-88. doi: 10.1002/cyto.a.22735. Epub 2015 Aug 14.

Cytometric fingerprinting: quantitative characterization of multivariate distributions.

Cytometry A. 2008 May;73(5):430-41. doi: 10.1002/cyto.a.20545.

Cited by

TEAM: A MULTIPLE TESTING ALGORITHM ON THE AGGREGATION TREE FOR FLOW CYTOMETRY ANALYSIS.

Ann Appl Stat. 2023 Mar;17(1):621-640. doi: 10.1214/22-aoas1645. Epub 2023 Jan 24.

Competition quenching strategies reduce antibiotic tolerance in polymicrobial biofilms.

NPJ Biofilms Microbiomes. 2024 Mar 19;10(1):23. doi: 10.1038/s41522-024-00489-6.

Isolation and monoculture of functional primary astrocytes from the adult mouse spinal cord.

Front Neurosci. 2024 Feb 16;18:1367473. doi: 10.3389/fnins.2024.1367473. eCollection 2024.

Deciphering variations in the endocytic uptake of a cell-penetrating peptide: the crucial role of cell culture protocols.

Cytotechnology. 2023 Dec;75(6):473-490. doi: 10.1007/s10616-023-00591-1. Epub 2023 Sep 8.

Role of anti-polyethylene glycol (PEG) antibodies in the allergic reactions to PEG-containing Covid-19 vaccines: Evidence for immunogenicity of PEG.

Vaccine. 2023 Jul 12;41(31):4561-4570. doi: 10.1016/j.vaccine.2023.06.009. Epub 2023 Jun 5.

Cold shock domain-containing protein E1 is a posttranscriptional regulator of the LDL receptor.

Sci Transl Med. 2022 Sep 14;14(662):eabj8670. doi: 10.1126/scitranslmed.abj8670.

Contribution of classical complement activation and IgM to the control of Rickettsia infection.

Mol Microbiol. 2021 Dec;116(6):1476-1488. doi: 10.1111/mmi.14839. Epub 2021 Nov 13.

Computational Analysis of Microbial Flow Cytometry Data.

mSystems. 2021 Jan 19;6(1):e00895-20. doi: 10.1128/mSystems.00895-20.

Biofilm Bacteria Use Stress Responses to Detect and Respond to Competitors.

Curr Biol. 2020 Apr 6;30(7):1231-1244.e4. doi: 10.1016/j.cub.2020.01.065. Epub 2020 Feb 20.

flowEMMi: an automated model-based clustering tool for microbial cytometric data.

BMC Bioinformatics. 2019 Dec 9;20(1):643. doi: 10.1186/s12859-019-3152-3.
