Too many zeros and/or highly skewed? A tutorial on modelling health behaviour as count data with Poisson and negative binomial regression.

Author

Green James A

Affiliations

School of Allied Health, University of Limerick, Limerick, Ireland.

Physical Activity for Health Research Cluster (Health Research Institute), University of Limerick, Limerick, Ireland.

Publication details

Health Psychol Behav Med. 2021 May 6;9(1):436-455. doi: 10.1080/21642850.2021.1920416.

DOI:10.1080/21642850.2021.1920416
PMID:34104569
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8159206/
Abstract

Dependent variables in health psychology are often counts, for example, of a behaviour or number of engagements with an intervention. These counts can be very strongly skewed, and/or contain large numbers of zeros as well as extreme outliers. For example, 'How many cigarettes do you smoke on an average day?' The modal answer may be zero, but answers may range from 0 to 40+. The same can be true for minutes of moderate-to-vigorous physical activity. For some people, this may be near zero, but it may take on extreme values for someone training for a marathon. Typical analytical strategies for these data involve explicit (or implied) transformations (smoker v. non-smoker, log transformations). However, these data types are 'counts' (i.e. non-negative whole numbers) or quasi-counts (time is ratio but discrete minutes of activity could be analysed as a count), and can be modelled using count distributions, including the Poisson and negative binomial distributions (and their zero-inflated and hurdle extensions, which allow even more zeros). In this tutorial paper I demonstrate (in R, Jamovi, and SPSS) the easy application of these models to health psychology data, and their advantages over alternative ways of analysing this type of data, using two datasets: one with a highly dispersed dependent variable (number of views on YouTube), and another with a large number of zeros (number of days on which symptoms were reported over a month). The negative binomial distribution had the best fit for the overdispersed number of views on YouTube. Negative binomial and zero-inflated negative binomial were both good fits for the symptom data with over-abundant zeros. In both cases, count distributions provided not just a better fit but would lead to different conclusions compared to the poorly fitting traditional regression/linear models.
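The two warning signs named in the abstract, overdispersion and excess zeros, can be checked before any model is fitted. The sketch below is not taken from the paper (which works in R, Jamovi, and SPSS); it is a minimal Python illustration using only the standard library. The function name `count_diagnostics` and the `symptom_days` values are hypothetical, invented for illustration. It relies on two properties of the Poisson distribution: the variance equals the mean, and the probability of a zero is exp(-mean).

```python
import math
from statistics import mean, variance


def count_diagnostics(counts):
    """Summarise how far a set of counts departs from a plain Poisson model.

    Under a Poisson distribution the variance equals the mean and
    P(Y = 0) = exp(-mean), so both comparisons are quick rough checks.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        # A ratio well above 1 suggests overdispersion.
        "dispersion_ratio": v / m,
        "observed_zero_prop": observed_zero,
        # Share of zeros a Poisson with the same mean would predict.
        "poisson_zero_prop": math.exp(-m),
    }


# Hypothetical data: days in a month on which symptoms were reported.
symptom_days = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 0, 12, 0, 3, 0, 0, 20]
result = count_diagnostics(symptom_days)

for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample the variance is far larger than the mean and the observed share of zeros is well above the Poisson prediction, which is the situation where the abstract suggests considering negative binomial, zero-inflated, or hurdle models. These are descriptive checks only; comparing fitted models, as the paper does, is still needed to choose between them.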
