
Bivariate zero-inflated regression for count data: a Bayesian approach with application to plant counts.

Authors

Majumdar Anandamayee, Gries Corinna

Affiliation

Arizona State University, AZ, USA.

Publication

Int J Biostat. 2010;6(1):Article 27. doi: 10.2202/1557-4679.1229.

DOI:10.2202/1557-4679.1229
PMID:21969981
Abstract

Lately, bivariate zero-inflated (BZI) regression models have been used in many instances in the medical sciences to model excess zeros. Examples include the BZI Poisson (BZIP) and BZI negative binomial (BZINB) models. Such formulations vary in the basic modeling aspect and use the EM algorithm (Dempster, Laird and Rubin, 1977) for parameter estimation. A different modeling formulation in the Bayesian context is given by Dagne (2004). We extend the modeling to a more general setting for multivariate ZIP models for count data with excess zeros as proposed by Li, Lu, Park, Kim, Brinkley and Peterson (1999), focusing on a particular bivariate regression formulation. For the basic formulation in the case of bivariate data, we assume that Xi are (latent) independent Poisson random variables with parameters λi, i = 0, 1, 2. A bivariate count vector (Y1, Y2) response follows a mixture of four distributions: p0 stands for the mixing probability of a point mass distribution at (0, 0); p1, the mixing probability that Y2 = 0 while Y1 = X0 + X1; p2, the mixing probability that Y1 = 0 while Y2 = X0 + X2; and finally (1 - p0 - p1 - p2), the mixing probability that Yi = Xi + X0, i = 1, 2. The choice of the parameters {pi, λi, i = 0, 1, 2} ensures that the marginal distributions of Yi, i = 1, 2, are zero-inflated Poisson (λ0 + λi). All the parameters thus introduced are allowed to depend on covariates through canonical link generalized linear models (McCullagh and Nelder, 1989). This flexibility allows for a range of real-life applications, especially in the medical and biological fields, where the counts are bivariate in nature (with strong association between the processes) and where there is an excess of zeros in one or both processes. Our contribution in this paper is to employ a fully Bayesian approach consolidating the work of Dagne (2004) and Li et al. (1999), generalizing the modeling and sampling-based methods described by Ghosh, Mukhopadhyay and Lu (2006) to estimate the parameters and obtain posterior credible intervals both when covariates are unavailable and when they are available. In this context, we provide explicit data augmentation techniques that lend themselves to easier implementation of the Gibbs sampler by giving rise to well-known and closed-form posterior distributions in the bivariate ZIP case. We then use simulations to explore the effectiveness of this estimation using the Bayesian BZIP procedure, comparing the performance to the Bayesian and classical ZIP approaches. Finally, we demonstrate the methodology based on bivariate plant count data with excess zeros that were collected on plots in the Phoenix metropolitan area and compare the results with independent ZIP regression models fitted to both processes.
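
The four-component mixture described in the abstract can be illustrated with a short simulation. The sketch below is not the authors' code: the function name `simulate_bzip` and the parameter values are illustrative assumptions, and it covers only the data-generating model without covariates, not the Gibbs sampler or the regression links. It draws the latent Poisson variables X0, X1, X2, assigns each observation to one of the four mixture components, and then compares the empirical mean and zero fraction of Y1 with the values implied by a zero-inflated Poisson margin that has structural-zero probability p0 + p2 and rate λ0 + λ1.

```python
import numpy as np


def simulate_bzip(n, p, lam, seed=0):
    """Draw n pairs (Y1, Y2) from the four-component BZIP mixture.

    p   = (p0, p1, p2): mixing probabilities named in the abstract
    lam = (lam0, lam1, lam2): Poisson means of the latent X0, X1, X2
    The function name and signature are hypothetical, for illustration only.
    """
    p0, p1, p2 = p
    p3 = 1.0 - p0 - p1 - p2  # probability that neither count is forced to zero
    if min(p0, p1, p2) < 0 or p3 < -1e-12:
        raise ValueError("p0, p1, p2 must be non-negative and sum to at most 1")
    p3 = max(p3, 0.0)  # guard against tiny negative rounding error

    rng = np.random.default_rng(seed)
    # Latent independent Poisson variables; X0 is shared by both responses.
    x0, x1, x2 = (rng.poisson(rate, size=n) for rate in lam)

    # Component labels: 0 -> (0, 0); 1 -> (X0 + X1, 0);
    #                   2 -> (0, X0 + X2); 3 -> (X0 + X1, X0 + X2)
    comp = rng.choice(4, size=n, p=[p0, p1, p2, p3])
    y1 = np.where((comp == 1) | (comp == 3), x0 + x1, 0)
    y2 = np.where((comp == 2) | (comp == 3), x0 + x2, 0)
    return y1, y2


if __name__ == "__main__":
    p = (0.3, 0.1, 0.1)      # illustrative values, not taken from the paper
    lam = (1.0, 2.0, 3.0)    # illustrative values, not taken from the paper
    y1, y2 = simulate_bzip(200_000, p, lam)

    # Marginal of Y1: zero with probability p0 + p2, else Poisson(lam0 + lam1).
    pi1 = p[0] + p[2]
    rate1 = lam[0] + lam[1]
    print("mean of Y1: empirical", y1.mean(), "theoretical", (1 - pi1) * rate1)
    print("P(Y1 = 0): empirical", (y1 == 0).mean(),
          "theoretical", pi1 + (1 - pi1) * np.exp(-rate1))
    # The shared X0 induces positive association in the non-zero component.
    print("sample correlation of (Y1, Y2):", np.corrcoef(y1, y2)[0, 1])
```

Sharing X0 between the two responses is what makes the counts dependent within the fourth component; fitting independent ZIP models to each margin, as in the comparison mentioned at the end of the abstract, ignores that shared term.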


Similar articles

1. Bivariate zero-inflated regression for count data: a Bayesian approach with application to plant counts.
   Int J Biostat. 2010;6(1):Article 27. doi: 10.2202/1557-4679.1229.
2. Zero-inflated Poisson and binomial regression with random effects: a case study.
   Biometrics. 2000 Dec;56(4):1030-9. doi: 10.1111/j.0006-341x.2000.01030.x.
3. Poisson, Poisson-gamma and zero-inflated regression models of motor vehicle crashes: balancing statistical fit and theory.
   Accid Anal Prev. 2005 Jan;37(1):35-46. doi: 10.1016/j.aap.2004.02.004.
4. Marginalized multilevel hurdle and zero-inflated models for overdispersed and correlated count data with excess zeros.
   Stat Med. 2014 Nov 10;33(25):4402-19. doi: 10.1002/sim.6237. Epub 2014 Jun 23.
5. The k-ZIG: flexible modeling for zero-inflated counts.
   Biometrics. 2012 Sep;68(3):878-85. doi: 10.1111/j.1541-0420.2011.01729.x. Epub 2012 Feb 20.
6. On performance of parametric and distribution-free models for zero-inflated and over-dispersed count responses.
   Stat Med. 2015 Oct 30;34(24):3235-45. doi: 10.1002/sim.6560. Epub 2015 Jun 15.
7. Modelling bivariate count series with excess zeros.
   Math Biosci. 2005 Aug;196(2):226-37. doi: 10.1016/j.mbs.2005.05.001.
8. Nonlinear mixed-effects modeling of longitudinal count data: Bayesian inference about median counts based on the marginal zero-inflated discrete Weibull distribution.
   Stat Med. 2021 Oct 15;40(23):5078-5095. doi: 10.1002/sim.9112. Epub 2021 Jun 21.
9. Modelling count data with excessive zeros: the need for class prediction in zero-inflated models and the issue of data generation in choosing between zero-inflated and generic mixture models for dental caries data.
   Stat Med. 2009 Dec 10;28(28):3539-53. doi: 10.1002/sim.3699.
10. A robust Bayesian mixed effects approach for zero inflated and highly skewed longitudinal count data emanating from the zero inflated discrete Weibull distribution.
   Stat Med. 2020 Apr 30;39(9):1275-1291. doi: 10.1002/sim.8475. Epub 2020 Feb 24.

Cited by

1. Modeling dragonfly population data with a Bayesian bivariate geometric mixed-effects model.
   J Appl Stat. 2022 May 6;50(10):2171-2193. doi: 10.1080/02664763.2022.2068513. eCollection 2023.