Sauzet Odile, Razum Oliver, Widera Teresia, Brzoska Patrick
Department of Epidemiology and International Public Health, School of Public Health, Bielefeld University, Bielefeld, Germany.
Centre for Statistics, Bielefeld University, Bielefeld, Germany.
Front Public Health. 2019 Jun 11;7:146. doi: 10.3389/fpubh.2019.00146. eCollection 2019.
Results of patient satisfaction questionnaires can contain a spike at the value corresponding to complete satisfaction. A possible interpretation is that there are two types of respondents: those who are willing to give a negative evaluation of one or more items in the questionnaire, and those who will always give a completely positive evaluation irrespective of the item. The aim of the present study is to compare various statistical approaches to the analysis of such data, using data from a rehabilitation patient survey of the German Statutory Pension Insurance Scheme as an example. We used data from 272,806 respondents who participated in the survey from 2008 to 2011. We illustrate four models: linear regression, logistic regression, a two-part model based on the assumption of two underlying populations, and quantile regression, which does not require any distributional assumptions. For each model we consider the relationship of the satisfaction score with two covariates. While the linear model provides correct estimates of the mean values (marginal effects), its assumptions are violated, which can lead to false interpretations. A two-part regression, which consists of a logistic regression followed by a linear regression conditional on not being fully satisfied, is a useful alternative. For research questions focusing on specific parts of the distribution, logistic regression and quantile regression should be considered. Data with a spike represent a statistical challenge, but a range of modeling approaches is available to provide sound interpretations and correct answers to research questions.
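The two-part model named in the abstract (a logistic regression for being fully satisfied, followed by a linear regression among respondents who are not fully satisfied) can be sketched as follows. This is a minimal illustration on simulated data, not the authors' code or data: the 0 to 100 score scale, the single binary covariate `x`, the simulation parameters, and the function names `fit_logistic`, `fit_two_part` and `predict_mean` are all assumptions made for the example.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Logistic regression fitted by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                      # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_two_part(X, score, max_score):
    """Part 1: P(score == max_score). Part 2: mean score given score < max_score."""
    full = (score == max_score).astype(float)
    beta_logit = fit_logistic(X, full)
    rest = score < max_score
    beta_lin, *_ = np.linalg.lstsq(X[rest], score[rest], rcond=None)
    return beta_logit, beta_lin


def predict_mean(X, beta_logit, beta_lin, max_score):
    """Overall mean implied by the two parts: p * max + (1 - p) * conditional mean."""
    p_full = 1.0 / (1.0 + np.exp(-X @ beta_logit))
    return p_full * max_score + (1.0 - p_full) * (X @ beta_lin)


# Simulated survey (hypothetical): spike at the maximum score of 100.
rng = np.random.default_rng(0)
n = 5000
max_score = 100.0
x = rng.integers(0, 2, size=n).astype(float)           # binary covariate
p_spike = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * x)))       # chance of full satisfaction
is_full = rng.random(n) < p_spike
below = np.clip(70.0 + 5.0 * x + rng.normal(0.0, 10.0, size=n), 0.0, 99.0)
score = np.where(is_full, max_score, below)

X = np.column_stack([np.ones(n), x])
beta_logit, beta_lin = fit_two_part(X, score, max_score)

# Predicted overall mean for x = 0 and x = 1.
X_new = np.array([[1.0, 0.0], [1.0, 1.0]])
pred = predict_mean(X_new, beta_logit, beta_lin, max_score)

print("log-odds coefficients (fully satisfied):", beta_logit)
print("linear coefficients (not fully satisfied):", beta_lin)
print("implied mean score for x = 0 and x = 1:", pred)
```

With a single binary covariate both parts are saturated, so the implied means coincide with the observed group means; the benefit of the two parts is that each coefficient has a separate interpretation (odds of being fully satisfied, and mean score among the remaining respondents).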