A comparison of zero-inflated and hurdle models for modeling zero-inflated count data.

Author information

Feng Cindy Xin

Affiliation

Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, 5790 University Avenue, Halifax, B3H 4R2 Nova Scotia Canada.

Publication information

J Stat Distrib Appl. 2021;8(1):8. doi: 10.1186/s40488-021-00121-4. Epub 2021 Jun 24.

DOI:10.1186/s40488-021-00121-4
PMID:34760432
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8570364/
Abstract

Count data with excess zeros are frequently encountered in practice. For example, the number of health service visits often includes many zeros, representing patients with no utilization during the follow-up period. A common feature of this type of data is that the count measure has more zeros than a common count distribution, such as the Poisson or negative binomial, can accommodate. Zero-inflated (ZI) or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, the fundamental differences between these two types of models have still not been adequately investigated. In this article, we reviewed the zero-inflated and hurdle models and highlighted their differences in terms of their data-generating processes. We also conducted simulation studies to evaluate the performance of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to the particular data in question.
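
The difference in data-generating processes mentioned in the abstract can be illustrated with a minimal sketch of the textbook Poisson versions of the two models. This is not code from the article: the function names and the parameter names `pi` (probability of a structural zero), `p0` (probability of a zero in the hurdle model), and `lam` (Poisson mean) are assumptions chosen for illustration. In the zero-inflated model, zeros come from two sources (structural zeros and ordinary Poisson zeros); in the hurdle model, all zeros come from a single binary process and positive counts follow a zero-truncated Poisson.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability P(Y = k)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson (illustrative parameterization).

    With probability pi the observation is a structural zero;
    otherwise it is drawn from Poisson(lam), which can also give zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, p0, lam):
    """Hurdle Poisson (illustrative parameterization).

    A zero occurs with probability p0; once the hurdle is crossed,
    the count follows a Poisson(lam) truncated to k >= 1.
    """
    if k == 0:
        return p0
    return (1.0 - p0) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    lam = 2.0
    print("Poisson P(Y=0):", poisson_pmf(0, lam))
    # ZIP can only add zeros relative to Poisson(lam).
    print("ZIP     P(Y=0):", zip_pmf(0, 0.3, lam))
    # The hurdle zero probability is free, so it may be below the Poisson value.
    print("Hurdle  P(Y=0):", hurdle_pmf(0, 0.05, lam))
```

In this sketch, the zero-inflated zero probability `pi + (1 - pi) * exp(-lam)` can never fall below the Poisson value `exp(-lam)`, whereas the hurdle zero probability `p0` is unconstrained and can therefore also describe fewer zeros than the Poisson would predict.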

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/6379510ac614/40488_2021_121_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/31987f1ba1d0/40488_2021_121_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/304a9a53d005/40488_2021_121_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/52be1ce694dc/40488_2021_121_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/90ac2f5bec18/40488_2021_121_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/bc93e46a1670/40488_2021_121_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/b5034c1cd662/40488_2021_121_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f714/8570364/773f0bd87c29/40488_2021_121_Fig8_HTML.jpg
