Farewell V T, Long D L, Tom B D M, Yiu S, Su L
Medical Research Council Biostatistics Unit, Institute of Public Health, University of Cambridge, Cambridge CB2 0SR, United Kingdom.
Department of Biostatistics, West Virginia University, Morgantown, West Virginia 26506.
Annu Rev Stat Appl. 2017 Mar;4:283-315. doi: 10.1146/annurev-statistics-060116-054131.
Statistical models that involve a two-part mixture distribution are applicable in a variety of situations. Frequently, the two parts are a model for the binary response variable and a model for the outcome variable that is conditioned on the binary response. Two common examples are zero-inflated or hurdle models for count data and two-part models for semicontinuous data. Recently, there has been particular interest in the use of these models for the analysis of repeated measures of an outcome variable over time. The aim of this review is to consider motivations for the use of such models in this context and to highlight the central issues that arise with their use. We examine two-part models for semicontinuous and zero-heavy count data, and we also consider models for count data with a two-part random effects distribution.
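To make the distinction between the two count-data examples named above concrete, the following is a minimal sketch that assumes a Poisson count component. The function names and the parameters `pi`, `p_zero`, and `lam` are illustrative assumptions and are not taken from the review. In the zero-inflated form, zeros arise both from a point mass and from the count distribution; in the hurdle form, all zeros come from the binary part and positive counts follow a zero-truncated distribution.

```python
import math


def poisson_pmf(y, lam):
    """Ordinary Poisson probability mass at count y with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zip_pmf(y, pi, lam):
    """Zero-inflated Poisson (illustrative parameterisation).

    With probability pi the outcome is a structural zero; otherwise it is
    drawn from Poisson(lam), which can itself produce a zero.
    """
    base = (1.0 - pi) * poisson_pmf(y, lam)
    return pi + base if y == 0 else base


def hurdle_pmf(y, p_zero, lam):
    """Hurdle Poisson (illustrative parameterisation).

    The binary part alone decides zero versus positive; positive counts
    follow a Poisson(lam) truncated at zero.
    """
    if y == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))
```

Under these assumptions, `zip_pmf(0, pi, lam)` always exceeds `pi`, because the Poisson part contributes additional zeros, whereas `hurdle_pmf(0, p_zero, lam)` equals `p_zero` exactly.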