Angel Stefan, Heuberger Richard, Lamei Nadja
1 Institute for Social Policy, WU Vienna University of Economics and Business, Welthandelsplatz 1, 1020 Vienna, Austria.
2 Statistics Austria, Guglgasse 13, 1110 Vienna, Austria.
Soc Indic Res. 2018;138(2):575-603. doi: 10.1007/s11205-017-1672-7. Epub 2017 Jun 12.
We take advantage of the fact that for the Austrian SILC 2008-2011, two data sources are available in parallel for the same households: register-based and survey-based income data. Thus, we aim to explain which households tend to under- or over-report their household income by estimating multinomial logit and OLS models with covariates referring to the interview situation, employment status and socio-demographic household characteristics. Furthermore, we analyze source-specific differences in the distribution of household income and how these differences affect aggregate poverty indicators based on household income. The analysis reveals an increase in the cross-sectional poverty rates for 2008-2011 and the longitudinal poverty rate if register data rather than survey data are used. These changes in the poverty rate are mainly driven by differences in employment income rather than sampling weights and other income components. Regression results show a pattern of mean-reverting errors when comparing household income between the two data sources. Furthermore, differences between data sources for both under-reporting and over-reporting slightly decrease with the number of panel waves in which a household participated. Among the other variables analyzed that are related to the interview situation (mode, proxy, interview month), only the number of proxy interviews was (weakly) positively correlated with the difference between data sources, although this outcome was not robust over different model specifications.
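The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the household incomes are invented, the poverty line of 60% of median income is an assumed convention that the abstract does not state, the 10% tolerance for calling two values consistent is arbitrary, and sampling weights and income equivalisation are ignored. The names `poverty_rate` and `classify_reporting` are hypothetical.

```python
from statistics import median


def poverty_rate(incomes, threshold_share=0.6):
    """Share of households with income below threshold_share * median income.

    The 60% default is an assumption made for this sketch.
    """
    threshold = threshold_share * median(incomes)
    return sum(1 for y in incomes if y < threshold) / len(incomes)


def classify_reporting(survey, register, tolerance=0.10):
    """Label a household by comparing survey income with register income.

    The tolerance band is arbitrary; register income must be positive.
    """
    rel_diff = (survey - register) / register
    if rel_diff < -tolerance:
        return "under"
    if rel_diff > tolerance:
        return "over"
    return "consistent"


# Invented (survey, register) annual incomes for six households.
households = [
    (14000, 8000),
    (13000, 12000),
    (20000, 20000),
    (24000, 24000),
    (29000, 30000),
    (50000, 60000),
]

survey_incomes = [s for s, _ in households]
register_incomes = [r for _, r in households]

# Poverty rate under each data source.
survey_rate = poverty_rate(survey_incomes)
register_rate = poverty_rate(register_incomes)

# Reporting category per household; this could serve as the outcome
# variable of a multinomial logit model.
labels = [classify_reporting(s, r) for s, r in households]

print(f"survey poverty rate:   {survey_rate:.3f}")
print(f"register poverty rate: {register_rate:.3f}")
print(labels)
```

In this toy data the lowest-income household over-reports and the highest-income household under-reports, which mimics the mean-reverting pattern the abstract mentions, and the register-based poverty rate comes out higher than the survey-based one.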