Centered Partition Processes: Informative Priors for Clustering (with Discussion).

Authors

Paganin Sally, Herring Amy H, Olshan Andrew F, Dunson David B

Affiliations

Department of Environmental Science, Policy, and Management, University of California, Berkeley.

Department of Statistical Science, Duke University, Durham.

Publication information

Bayesian Anal. 2021 Mar;16(1):301-370. doi: 10.1214/20-BA1197. Epub 2020 Feb 13.

DOI:10.1214/20-BA1197
PMID:35958029
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9364237/

Abstract

There is a very rich literature proposing Bayesian approaches for clustering starting with a prior probability distribution on partitions. Most approaches assume exchangeability, leading to simple representations in terms of Exchangeable Partition Probability Functions (EPPF). Gibbs-type priors encompass a broad class of such cases, including Dirichlet and Pitman-Yor processes. Even though there have been some proposals to relax the exchangeability assumption, allowing covariate-dependence and partial exchangeability, limited consideration has been given on how to include concrete prior knowledge on the partition. For example, we are motivated by an epidemiological application, in which we wish to cluster birth defects into groups and we have prior knowledge of an initial clustering provided by experts. As a general approach for including such prior knowledge, we propose a Centered Partition (CP) process that modifies the EPPF to favor partitions close to an initial one. Some properties of the CP prior are described, a general algorithm for posterior computation is developed, and we illustrate the methodology through simulation examples and an application to the motivating epidemiology study of birth defects.
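
The idea of modifying the EPPF to favor partitions near an initial one can be illustrated on a toy problem. The sketch below is not taken from the abstract and rests on several assumptions: the baseline EPPF is that of a Dirichlet process with concentration `alpha`, closeness to the initial partition `c0` is measured by the Variation of Information, and the penalty is exponential with strength `psi`, so that p(c | c0, psi) is proportional to p0(c) exp(-psi d(c, c0)). All function names are hypothetical, and brute-force enumeration is feasible only for a handful of items.

```python
import math
from collections import Counter


def set_partitions(n):
    """Yield every partition of items 0..n-1 as a tuple of cluster labels
    in restricted-growth form (labels appear in order of first use)."""
    def grow(labels, n_clusters):
        if len(labels) == n:
            yield tuple(labels)
            return
        # The next item may join any existing cluster or open a new one.
        for k in range(n_clusters + 1):
            yield from grow(labels + [k], max(n_clusters, k + 1))
    yield from grow([], 0)


def dp_eppf(labels, alpha):
    """Dirichlet process EPPF: alpha^K * prod((n_j - 1)!) / (alpha)_n,
    where (alpha)_n is the rising factorial. Assumed baseline prior."""
    sizes = Counter(labels).values()
    rising = math.prod(alpha + i for i in range(len(labels)))
    return alpha ** len(sizes) * math.prod(math.factorial(s - 1) for s in sizes) / rising


def entropy(counts, n):
    """Shannon entropy (in bits) of a partition with the given block sizes."""
    return -sum((m / n) * math.log2(m / n) for m in counts)


def variation_of_information(a, b):
    """VI(a, b) = 2 H(a, b) - H(a) - H(b). Assumed distance between partitions."""
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)
    return 2 * h_ab - h_a - h_b


def centered_partition_prior(c0, psi, alpha=1.0):
    """Return {partition: probability} with the baseline EPPF tilted
    toward c0 by exp(-psi * VI), normalized by exhaustive enumeration."""
    parts = list(set_partitions(len(c0)))
    weights = [
        dp_eppf(c, alpha) * math.exp(-psi * variation_of_information(c, c0))
        for c in parts
    ]
    total = sum(weights)
    return {c: w / total for c, w in zip(parts, weights)}


if __name__ == "__main__":
    c0 = (0, 0, 1, 1)  # hypothetical expert clustering of four items
    for psi in (0.0, 1.0, 3.0):
        prior = centered_partition_prior(c0, psi)
        print(f"psi={psi}: P(c0) = {prior[c0]:.4f}")
```

In this sketch, `psi = 0` returns the baseline EPPF unchanged, and increasing `psi` moves prior mass toward `c0`. The initial partition must be supplied in restricted-growth form so that it matches a key of the returned dictionary.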
