Greene William, Harris Mark N, Srivastava Preety, Zhao Xueyan
Stern School of Business, New York University, New York, NY, USA.
Curtin Business School, Curtin University, Perth, WA, Australia.
Health Econ. 2018 Feb;27(2):372-389. doi: 10.1002/hec.3553. Epub 2017 Aug 4.
When modelling "social bads" such as illegal drug consumption, researchers often face a dependent variable with a large number of zero observations. Building on the recent literature on hurdle and double-hurdle models, we propose a double-inflated modelling framework in which the zero observations can come from three sources: nonparticipants; participant misreporters (who have larger loss functions associated with a truthful response); and infrequent consumers. Because of our empirical application, the model is derived for the case of an ordered discrete dependent variable. However, other zero-inflated models (e.g., zero-inflated count models and double-hurdle models for continuous variables) can be augmented in the same way. The model is then applied to a consumer choice problem of cannabis consumption. We estimate that 17% of the reported zeros in the cannabis survey come from individuals who misreport their participation, 11% from infrequent users, and only 72% from true nonparticipants.
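The three-way split of reported zeros can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes the participation, misreporting, and consumption-level processes are independent, which the paper may not impose, and the function name, parameter names, and input values are hypothetical and chosen only for illustration.

```python
def zero_decomposition(p_participate, p_misreport, p_infrequent_zero):
    """Split P(reported zero) into its three sources.

    Assumption (not from the paper): the three latent processes are
    independent, so the probabilities simply multiply.
    """
    for p in (p_participate, p_misreport, p_infrequent_zero):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    # Source 1: true nonparticipants always report zero.
    nonparticipant = 1.0 - p_participate
    # Source 2: participants who misreport their participation as zero.
    misreporter = p_participate * p_misreport
    # Source 3: truthful participants whose consumption level is zero
    # in the survey period (infrequent consumers).
    infrequent = p_participate * (1.0 - p_misreport) * p_infrequent_zero

    p_zero = nonparticipant + misreporter + infrequent
    if p_zero == 0.0:
        raise ValueError("no zeros are possible under these inputs")

    shares = {
        "nonparticipant": nonparticipant / p_zero,
        "misreporter": misreporter / p_zero,
        "infrequent": infrequent / p_zero,
    }
    return {"p_zero": p_zero, "shares": shares}


# Hypothetical inputs, not estimates from the paper.
result = zero_decomposition(
    p_participate=0.4, p_misreport=0.25, p_infrequent_zero=0.2
)
```

The `shares` entries play the same role as the 72%, 17%, and 11% figures reported above: each is the fraction of all reported zeros attributed to one source, so they sum to one.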