A simulation study of the performance of statistical models for count outcomes with excessive zeros.

Author affiliations

Department of Population and Community Health, University of North Texas Health Science Center, Fort Worth, Texas, USA.

Norden Lofts, White Plains, New York, USA.

Publication information

Stat Med. 2024 Oct 30;43(24):4752-4767. doi: 10.1002/sim.10198. Epub 2024 Aug 28.

DOI:10.1002/sim.10198

PMID:39193779

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11483204/

Abstract

BACKGROUND

Outcome measures that are count variables with excessive zeros are common in health behaviors research. Examples include the number of standard drinks consumed or alcohol-related problems experienced over time. There is a lack of empirical data about the relative performance of prevailing statistical models for assessing the efficacy of interventions when outcomes are zero-inflated, particularly compared with recently developed marginalized count regression approaches for such data.

METHODS

The current simulation study examined five commonly used approaches for analyzing count outcomes: two linear models (with outcomes on the raw and log-transformed scales, respectively) and three prevailing count distribution-based models (i.e., the Poisson, negative binomial, and zero-inflated Poisson [ZIP] models). We also considered the marginalized zero-inflated Poisson (MZIP) model, a novel alternative that estimates the overall effects on the population mean while adjusting for zero-inflation. Motivated by alcohol misuse prevention trials, extensive simulations were conducted to evaluate and compare the statistical power and Type I error rate of the statistical models and approaches across data conditions that varied in sample size (up to 500), zero rate (0.2 to 0.8), and intervention effect sizes.
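For readers unfamiliar with the distinction between the ZIP and MZIP models, the following sketch writes out both in one common notation. The symbols ($\psi_i$, $\mu_i$, $\nu_i$, $\gamma$, $\beta$, $\alpha$, $x_i$, $z_i$) are assumptions introduced here for illustration and do not appear in the abstract; the formulation follows the standard textbook presentation and is not necessarily the exact parameterization used in the paper.

```latex
% Zero-inflated Poisson distribution: a structural zero with probability \psi_i,
% otherwise a Poisson count with mean \mu_i.
\begin{align*}
P(Y_i = 0) &= \psi_i + (1 - \psi_i)\, e^{-\mu_i}, \\
P(Y_i = y) &= (1 - \psi_i)\, \frac{e^{-\mu_i} \mu_i^{y}}{y!}, \qquad y = 1, 2, \ldots
\end{align*}

% ZIP regression: covariates act on the zero part and on the latent count mean.
\begin{align*}
\operatorname{logit}(\psi_i) &= z_i^{\top} \gamma, &
\log(\mu_i) &= x_i^{\top} \beta .
\end{align*}

% Overall (marginal) mean implied by the mixture.
\[
\nu_i = E(Y_i) = (1 - \psi_i)\, \mu_i .
\]

% MZIP regression: same zero part, but the marginal mean is modeled directly,
% so \exp(\alpha_j) is a ratio of overall population means.
\begin{align*}
\operatorname{logit}(\psi_i) &= z_i^{\top} \gamma, &
\log(\nu_i) &= x_i^{\top} \alpha .
\end{align*}
```

Under this notation, a ZIP coefficient $\beta_j$ describes the effect on the count part only, whereas an MZIP coefficient $\alpha_j$ describes the effect on the overall mean, which is the quantity an intervention trial usually targets.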

RESULTS

Under zero-inflation, the Poisson model failed to control the Type I error rate, producing more false positive results than expected. When the intervention effects on the zero (vs. non-zero) and count parts were in the same direction, the MZIP model had the highest statistical power, followed by the linear model with outcomes on the raw scale, the negative binomial model, and the ZIP model. The performance of the linear model with a log-transformed outcome variable was unsatisfactory.
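The Type I error finding for the Poisson model can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' simulation design: a two-arm trial with 250 participants per arm, a structural zero probability of 0.5, a Poisson mean of 5 for the non-zero part, and no true intervention effect. All function names are hypothetical. For a Poisson regression with a single binary covariate, the Wald test has a closed form, which is used here in place of fitting a model; the linear model on the raw scale is approximated by an unequal-variance z-test.

```python
import numpy as np


def simulate_zip(rng, n, zero_rate, lam):
    """Draw n counts: a structural zero with probability zero_rate, else Poisson(lam)."""
    structural_zero = rng.random(n) < zero_rate
    counts = rng.poisson(lam, size=n)
    return np.where(structural_zero, 0, counts)


def poisson_wald_rejects(y0, y1, crit=1.96):
    """Wald test of the log rate ratio from a Poisson model with one binary covariate.

    The MLE is log(mean1 / mean0) and the model-based variance is
    1 / sum(y1) + 1 / sum(y0), which assumes variance equals the mean.
    """
    s0, s1 = y0.sum(), y1.sum()
    if s0 == 0 or s1 == 0:
        return False  # log rate ratio undefined; count as no rejection
    log_rr = np.log(y1.mean() / y0.mean())
    se = np.sqrt(1.0 / s1 + 1.0 / s0)
    return bool(abs(log_rr / se) > crit)


def linear_rejects(y0, y1, crit=1.96):
    """Difference in raw means with an unequal-variance standard error."""
    diff = y1.mean() - y0.mean()
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return bool(abs(diff / se) > crit)


def type_one_error(n_per_arm=250, zero_rate=0.5, lam=5.0, n_reps=2000, seed=0):
    """Estimate rejection rates when both arms share the same distribution."""
    rng = np.random.default_rng(seed)
    poisson_hits = 0
    linear_hits = 0
    for _ in range(n_reps):
        y0 = simulate_zip(rng, n_per_arm, zero_rate, lam)
        y1 = simulate_zip(rng, n_per_arm, zero_rate, lam)
        poisson_hits += poisson_wald_rejects(y0, y1)
        linear_hits += linear_rejects(y0, y1)
    return poisson_hits / n_reps, linear_hits / n_reps


if __name__ == "__main__":
    poisson_rate, linear_rate = type_one_error()
    print(f"Poisson Wald test rejection rate: {poisson_rate:.3f}")
    print(f"Linear (raw scale) rejection rate: {linear_rate:.3f}")
```

With these settings the zero-inflated outcome has a variance several times its mean, so the Poisson standard error is too small and its rejection rate should land well above the nominal 0.05, while the raw-scale linear comparison should stay close to it.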

CONCLUSIONS

The MZIP model demonstrated better statistical properties in detecting true intervention effects and controlling false positive results for zero-inflated count outcomes. This MZIP model may serve as an appealing analytical approach to evaluating overall intervention effects in studies with count outcomes marked by excessive zeros.
