Department of Sociology, University of Notre Dame, 4060 Jenkins Nanovic Halls, Notre Dame, IN 46556, USA.
Soc Sci Res. 2023 Jan;109:102802. doi: 10.1016/j.ssresearch.2022.102802. Epub 2022 Nov 3.
Social scientists are often interested in seeing how the estimated effects of variables change once other variables are controlled for. For example, a simple analysis may reveal that income differs by race, but why does it differ? To answer such a question, a researcher might estimate a model where race is the only independent variable, and then add variables such as education to subsequent models. If the original estimated effect of race declines, this may be because race affects education, which in turn affects income. What is not universally realized is that the interpretation of such nested models can be problematic when logit or probit techniques are employed with binary dependent variables. Naïve comparisons of coefficients between models can indicate differences where none exist, hide differences that do exist, and even show differences in the opposite direction from those that actually exist. We discuss why these problems occur and illustrate their potential consequences. Proposed solutions, such as Linear Probability Models, Y-standardization, the Karlson/Holm/Breen method, and marginal effects, are explained and evaluated.
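The rescaling problem described above can be illustrated with a small simulation. The following is a minimal sketch and is not taken from the article: the sample size, seed, true coefficients, and the helper names `solve`, `fit_logit`, and `simulate_and_fit` are all illustrative assumptions. It generates a binary outcome from a latent variable that depends on two independent predictors, then fits a logit model with and without the second predictor using a hand-written Newton-Raphson routine.

```python
import math
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_logit(X, y, iters=12):
    """Maximum-likelihood logit fit via Newton-Raphson; each row of X includes a leading 1."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, row))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * row[a]
                for c in range(k):
                    hess[a][c] += w * row[a] * row[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def simulate_and_fit(n=10000, seed=0):
    """Return the x1 coefficient from the reduced (x1 only) and full (x1, x2) logit models."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # drawn independently of x1
    y = []
    for a, b in zip(x1, x2):
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        e = math.log(u / (1.0 - u))               # standard logistic error
        y_star = 1.0 * a + 2.0 * b + e            # assumed latent-variable model
        y.append(1.0 if y_star > 0 else 0.0)
    reduced = fit_logit([[1.0, a] for a in x1], y)
    full = fit_logit([[1.0, a, b] for a, b in zip(x1, x2)], y)
    return reduced[1], full[1]


b_reduced, b_full = simulate_and_fit()
print(f"x1 coefficient, model without x2: {b_reduced:.3f}")
print(f"x1 coefficient, model with x2:    {b_full:.3f}")
```

Because `x2` is generated independently of `x1`, adding it to a linear regression would leave the `x1` coefficient unchanged in expectation. In the logit model the coefficient grows instead, since the residual variance is fixed by assumption and the coefficients are therefore identified only up to a scale that changes when predictors are added. A naïve comparison would read this change as evidence of suppression or confounding that the simulated data do not contain.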