Bastos Fernando de Souza, Barreto-Souza Wagner
Instituto de Ciências Exatas e Tecnológicas, Universidade Federal de Viçosa - Campus UFV - Florestal, Florestal, Brazil.
Departamento de Estatística, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil.
J Appl Stat. 2020 Jun 14;48(11):1896-1916. doi: 10.1080/02664763.2020.1780570. eCollection 2021.
The sample selection bias problem occurs when the outcome of interest is observed only according to some selection rule and there is a dependence structure between the outcome and the selection rule. In a pioneering work, J. Heckman proposed a sample selection model based on a bivariate normal distribution to deal with this problem. Because the normal distribution is not robust, many alternatives have been introduced in the literature that assume extensions of the normal distribution, such as the Student-t and skew-normal models. One common limitation of the existing sample selection models is that they require a transformation of the outcome of interest, which is commonly positive real-valued, such as income and wage. As a result, data are analyzed on a non-original scale, which complicates the interpretation of the parameters. In this paper, we propose a sample selection model based on the bivariate Birnbaum-Saunders distribution, which has the same number of parameters as the classical Heckman model. Further, our associated outcome equation is positive real-valued. We discuss estimation by maximum likelihood and present some Monte Carlo simulation studies. An empirical application to the ambulatory expenditures data from the 2001 Medical Expenditure Panel Survey is presented.
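The selection problem described in the abstract can be illustrated with a small simulation in the classical bivariate normal (Heckman-type) setting. This is only a sketch under stated assumptions: the coefficients, sample size, error correlation `rho`, and the helper names `simulate_selection` and `ols` are hypothetical choices for illustration and do not come from the paper. It does not implement the proposed Birnbaum-Saunders model; it only shows why ordinary least squares on the selected sample is biased when the outcome and selection errors are correlated.

```python
import numpy as np

def simulate_selection(n=20000, rho=0.7, seed=0):
    """Simulate an outcome that is observed only when a latent selection variable is positive."""
    rng = np.random.default_rng(seed)
    # Correlated errors: column 0 drives the outcome, column 1 drives selection (assumed values).
    cov = np.array([[1.0, rho], [rho, 1.0]])
    errors = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    x = rng.normal(size=n)
    outcome = 1.0 + 0.5 * x + errors[:, 0]   # illustrative outcome equation
    selection_latent = x + errors[:, 1]      # illustrative selection equation
    observed = selection_latent > 0          # outcome is seen only for these units
    return x, outcome, observed

def ols(x, y):
    """Return the least-squares (intercept, slope) of y regressed on x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

x, y, observed = simulate_selection()
full_fit = ols(x, y)                          # uses outcomes that would normally be unobserved
selected_fit = ols(x[observed], y[observed])  # what an analyst with selected data can compute

print("full sample     (intercept, slope):", full_fit)
print("selected sample (intercept, slope):", selected_fit)
```

With positively correlated errors, the selected-sample fit should show an inflated intercept and an attenuated slope relative to the full-sample fit, which is the bias that sample selection models are designed to correct.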