Identifying Mixtures of Mixtures Using Bayesian Estimation.

Authors

Malsiner-Walli Gertraud, Frühwirth-Schnatter Sylvia, Grün Bettina

Affiliations

Department of Applied Statistics, Johannes Kepler University, Linz, Austria.

Institute of Statistics and Mathematics, Wirtschaftsuniversität, Wien, Austria.

Publication information

J Comput Graph Stat. 2017 Apr 3;26(2):285-295. doi: 10.1080/10618600.2016.1200472. Epub 2017 Apr 24.

DOI:10.1080/10618600.2016.1200472
PMID:28626349
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5455957/
Abstract

The use of a finite mixture of normal distributions in model-based clustering allows us to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior, where the hyperparameters are carefully selected such that they are reflective of the cluster structure aimed at. In addition, this prior allows us to estimate the model using standard MCMC sampling methods. In combination with a post-processing approach which resolves the label switching issue and results in an identified model, our approach allows us to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semiparametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark datasets. Supplementary materials for this article are available online.
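
The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the univariate setting, and all numeric values are assumptions chosen for illustration. Each cluster density is itself a finite mixture of normals, and the overall density is a weighted sum of the cluster densities. The last helper draws cluster weights from a symmetric Dirichlet distribution; using a small concentration value here is an assumed stand-in for a sparsity-inducing prior on the cluster weights.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cluster_density(y, weights, means, sds):
    """Density of one cluster, modelled as a finite mixture of normals."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))


def mixture_of_mixtures_density(y, cluster_weights, clusters):
    """Overall density: a weighted sum of the cluster densities.

    `clusters` is a list of (weights, means, sds) tuples, one per cluster.
    """
    return sum(
        eta * cluster_density(y, *params)
        for eta, params in zip(cluster_weights, clusters)
    )


def sample_dirichlet(alpha, rng):
    """Draw one weight vector from a Dirichlet distribution via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


# Illustrative (assumed) parameters: a non-Gaussian cluster built from two
# normal subcomponents, plus a second cluster that is a single normal.
EXAMPLE_CLUSTERS = [
    ([0.6, 0.4], [-3.0, -2.0], [0.5, 1.0]),
    ([1.0], [3.0], [1.0]),
]
EXAMPLE_CLUSTER_WEIGHTS = [0.7, 0.3]

if __name__ == "__main__":
    rng = random.Random(0)
    # A small concentration tends to put most of the mass on few clusters.
    print(sample_dirichlet([0.01] * 10, rng))
    print(mixture_of_mixtures_density(0.0, EXAMPLE_CLUSTER_WEIGHTS, EXAMPLE_CLUSTERS))
```

In this sketch the number of clusters and the number of subcomponents per cluster are fixed by hand. The approach in the abstract instead estimates the number of clusters from the data with MCMC sampling followed by post-processing.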

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05ff/5455957/b32da85dc2b9/ucgs_a_1200472_f0003_c.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05ff/5455957/efceac6cff54/ucgs_a_1200472_f0001_b.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05ff/5455957/488cd27c203e/ucgs_a_1200472_f0002_c.jpg
