Web-Based Public Ratings of General Practitioners in Norway: Validation Study.

Author information

Bjertnæs Øyvind, Iversen Hilde Hestad, Norman Rebecka, Valderas Jose M

Affiliations

Norwegian Institute of Public Health, Oslo, Norway.

Department of Family Medicine, National University Health System, Singapore, Singapore.

Publication information

JMIR Form Res. 2023 Mar 17;7:e38932. doi: 10.2196/38932.

Abstract

BACKGROUND

Understanding the complex relationships among multiple strategies for gathering users' perspectives in the evaluation of the performance of services is crucial for the interpretation of user-reported measures.

OBJECTIVE

The main objectives were to (1) evaluate the psychometric performance of an 11-item web-based questionnaire of ratings of general practitioners (GPs) currently used in Norway (Legelisten.no) and (2) assess the association between web-based and survey-based patient experience indicators.

METHODS

We included all published ratings on GPs and practices on Legelisten.no in the period of May 5, 2012, to December 15, 2021 (N=76,521). The questionnaire consists of 1 mandatory item and 10 voluntary items with 5 response categories (1 to 5 stars), alongside an open-ended review question and background variables. Questionnaire dimensionality and internal consistency were assessed with Cronbach α, exploratory factor, and item response theory analyses, and a priori hypotheses were developed for assessing construct validity (chi-square analysis). We calculated Spearman correlations between web-based ratings and reference patient experience indicators based on survey data using the patient experiences with the GP questionnaire (n=5623 respondents for a random sample of 50 GPs).
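The internal consistency statistic named above, Cronbach α, can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the listwise handling of missing ratings, and the example star ratings are all assumptions made for illustration.

```python
from statistics import variance


def drop_incomplete(rows):
    """Keep only respondents who rated every item (listwise deletion).

    Assumption: the abstract does not say how missing ratings were handled;
    listwise deletion is used here only to keep the sketch simple.
    """
    return [row for row in rows if all(score is not None for score in row)]


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    if len(rows) < 2:
        raise ValueError("need at least 2 complete respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least 2 items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 1-to-5 star ratings on three items (invented, not study data).
ratings = [
    [5, 5, 4],
    [1, 2, 1],
    [4, 4, 5],
    [5, None, 5],  # incomplete rating, dropped before the calculation
    [2, 1, 2],
]
alpha = cronbach_alpha(drop_incomplete(ratings))
```

A value close to 1 means the items move together across respondents; very high values can also signal redundant items.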

RESULTS

Web-based raters were predominantly women (n=32,074, 64.0%), mostly aged 20-50 years (n=35,113, 74.6%), and mostly reported 5 or fewer consultations with the GP each year (n=28,798, 64.5%). Across the nonmandatory items, between 18.9% (n=14,500) and 27.4% (n=20,960) of ratings were missing. A total of 4 of 11 rating items showed a U-shaped distribution, with >60% reporting 5 stars. Factor analysis and internal consistency testing identified 2 rating scales: "GP" (5 items; α=.98) and "practice" (6 items; α=.85). Some associations were not consistent with the a priori hypotheses, so the construct validity of the ratings was only partially confirmed. Item response theory analysis results were adequate for the "practice" scale but not for the "GP" scale, whose items had inflated discrimination (>5) and were distributed over a narrow interval of the scale. The correlations between the web-based GP scale and GP reference indicators ranged from 0.34 (P=.021) to 0.44 (P=.002), while the correlations between the web-based practice scale and reference indicators ranged from 0.17 (not significant) to 0.49 (P<.001). The strongest correlations between web-based and survey scores were found for items measuring practice-related experiences: phone availability (ρ=0.51), waiting time in the office (ρ=0.62), and other staff (ρ=0.54-0.58; P<.001).
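The ρ values reported above are Spearman rank correlations. As a rough sketch of how such a coefficient is obtained (not the authors' code; the function names and the tie-handling choice of average ranks are assumptions), each variable is converted to ranks and a Pearson correlation is computed on the ranks:

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical per-GP mean scores (invented, not study data):
# web-based star ratings vs. survey-based indicator on a 0-100 scale.
web_scores = [4.8, 3.9, 4.5, 2.7, 4.1]
survey_scores = [88.0, 80.5, 79.0, 70.2, 84.3]
rho = spearman_rho(web_scores, survey_scores)
```

Because only ranks enter the calculation, the coefficient reflects whether GPs are ordered similarly by the two sources, not whether the two scores agree in absolute level.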

CONCLUSIONS

The practice scale of the web-based ratings has adequate psychometric performance, while the GP scale suffers from important limitations. The associations with survey-based patient experience indicators were accordingly mostly weak to modest. Our study underlines the importance of interpreting web-based ratings with caution and the need to further develop rating sites.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04f5/10131642/c689eb9a6736/formative_v7i1e38932_fig1.jpg
