Brinkman Niels, Looman Rick, Jayakumar Prakash, Ring David, Choi Seung
Department of Surgery and Perioperative Care, Dell Medical School, The University of Texas at Austin, Austin, TX, USA.
The Center for Applied Psychometric Research, Educational Psychology Department, The University of Texas at Austin, Austin, TX, USA.
Clin Orthop Relat Res. 2025 Apr 1;483(4):693-703. doi: 10.1097/CORR.0000000000003262. Epub 2024 Oct 25.
BACKGROUND: Patient-reported experience measures (PREMs), such as the Jefferson Scale of Patient Perceptions of Physician Empathy (JSPPPE) or the Wake Forest Trust in Physician Scale (WTPS), have notable intercorrelation and ceiling effects (the proportion of observations with the highest possible score). A high ceiling effect loses information because there is almost certainly some variation among the patients with the highest score that the measurement tool cannot capture. Efforts to identify and quantify factors associated with diminished patient experience can benefit from a PREM with more variability and a smaller proportion of highest possible scores (that is, a more limited ceiling effect) than occurs with currently available PREMs.
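The ceiling effect defined above is a simple proportion, which the following minimal sketch makes concrete. The function name `ceiling_effect` and the example scores are hypothetical and are not taken from the study; the maximum of 35 assumes a 7-item scale scored 1 to 5 per item.

```python
from typing import Sequence


def ceiling_effect(scores: Sequence[float], max_score: float) -> float:
    """Return the proportion of observations at the highest possible score."""
    if len(scores) == 0:
        raise ValueError("scores must not be empty")
    at_ceiling = sum(1 for s in scores if s == max_score)
    return at_ceiling / len(scores)


# Hypothetical total scores (illustrative only, not study data).
example_scores = [35, 35, 30, 28, 35, 21, 33, 35, 26, 31]
print(f"Ceiling effect: {ceiling_effect(example_scores, 35):.0%}")
```

In this made-up sample, 4 of 10 respondents have the maximum score, so the measure cannot distinguish among them.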
QUESTIONS/PURPOSES: In the first stage of a two-stage process, using a cohort of patients seeking musculoskeletal specialty care, we asked: (1) What groupings of items that address a similar aspect of patient experience are present among binary items directed at patient experience and derived from commonly used PREMs? (2) Can a small number of representative items provide a measure with potential for less of a ceiling effect (high item difficulty parameters)? In a second, independent cohort enrolled to assess whether the identified items perform consistently among different cohorts, we asked: (3) Does the new PREM perform differently in terms of item groupings (factor structure), and would different subsets of the included items provide the same measurement results (internal consistency) when items are measured using a 5-point rating scale instead of a binary scale? (4) What are the differences in survey properties (for example, ceiling effects) and correlation between the new PREM and commonly used PREMs?
METHODS: In two cross-sectional studies among patients seeking musculoskeletal specialty care conducted in 2022 and 2023, all English-speaking and English-reading adults (ages 18 to 89 years) without cognitive deficiency were invited to participate in two consecutive, separate cohorts to help develop (the initial, learning cohort) and internally validate (the second, validation cohort) a provisional new PREM. We identified 218 eligible patients for the initial learning cohort, all of whom completed all measures. Participants had a mean ± SD age of 55 ± 16 years, 60% (130) were women, 45% (99) had private insurance, and most sought care for lower extremity (56% [121]) and nontraumatic conditions (63% [137]). We measured 25 items derived from other commonly used PREMs that address aspects of patient experience in which patients reported whether they agreed or disagreed (binary) with certain statements about their clinician. We performed an exploratory factor analysis and confirmatory factor analysis (CFA) to identify groups of items that measure the same underlying construct related to patient experience. We then applied a two-parameter logistic model based on item response theory to identify the most discriminating items with the most variability (item difficulty) with the aim of reducing the ceiling effect. We also conducted a differential item functioning analysis to assess whether specific items are rated discordantly by specific subgroups of patients, which can introduce bias. We then enrolled 154 eligible patients, of whom 99% (153) completed all required measures, into a validation cohort with similar demographic characteristics. We changed the binary items to 5-point Likert scales to increase the potential for variation in an attempt to further reduce ceiling effects and repeated the CFA. We also measured internal consistency (using Cronbach alpha) and the correlation of the new PREM with other commonly used PREMs using bivariate analyses.
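To illustrate why higher item difficulty is expected to reduce the ceiling effect, the sketch below evaluates the standard two-parameter logistic item response function, P(agree | θ) = 1 / (1 + exp(-a(θ - b))), where θ is the latent trait level, a is item discrimination, and b is item difficulty. This is a generic textbook form, not the authors' fitted model; the function name, the discrimination value of 2.0, and the difficulty values are assumptions chosen for illustration.

```python
import math


def two_pl_probability(theta: float, discrimination: float, difficulty: float) -> float:
    """Probability of endorsing a binary item under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical respondent at the average trait level (theta = 0)
# and a hypothetical discrimination parameter shared by all items.
theta = 0.0
discrimination = 2.0

# Illustrative difficulty values: lower difficulty means the item is easier to endorse.
for difficulty in (-1.3, -0.37, 0.16):
    p = two_pl_probability(theta, discrimination, difficulty)
    print(f"difficulty={difficulty:+.2f} -> P(agree)={p:.2f}")
```

Under these assumptions, an item with low difficulty is endorsed by nearly everyone at an average trait level, so it contributes little variation; items with difficulty nearer to or above zero separate respondents better.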
RESULTS: We identified three groupings of items in the learning cohort representing "trust in clinician" (13 items), "relationship with clinician" (7 items), and "participation in shared decision-making" (4 items). The "trust in clinician" factor performed best of all three factors and therefore was selected for subsequent analyses. We selected the best-performing items in terms of item difficulty to generate a 7-item short form. We found excellent CFA model fit (the 13-item and 7-item versions both had a root mean square error of approximation [RMSEA] of < 0.001), excellent internal consistency (Cronbach α was 0.94 for the 13-item version and 0.91 for the 7-item version), good item response theory parameters (item difficulty ranging between -0.37 and 0.16 for the 7-item version, with higher values indicating lower ceiling effect), no local dependencies, and no differential item functioning among any of the items. The other two factors were excluded from measure development due to low item response theory parameters (item difficulty ranging between -1.3 and -0.69, indicating higher ceiling effect), multiple local dependencies, and exhausting the number of items without being able to address these issues. The validation cohort confirmed adequate item selection and performance of both the 13-item and 7-item versions of the Trust and Experience with Clinicians Scale (TRECS), with good to excellent CFA model fit (RMSEA 0.058 [TRECS-13]; RMSEA 0.016 [TRECS-7]), excellent internal consistency (Cronbach α = 0.96 [TRECS-13]; Cronbach α = 0.92 [TRECS-7]), no differential item functioning and limited ceiling effects (11% [TRECS-13]; 14% [TRECS-7]), and notable correlation with other PREMs such as the JSPPPE (ρ = 0.77) and WTPS (ρ = 0.74).
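For readers unfamiliar with the internal consistency statistic reported above, the following minimal sketch computes Cronbach α from its usual definition, α = k/(k-1) × (1 - Σ item variances / variance of total scores), where k is the number of items. It uses only the Python standard library; the function name and the response matrix are hypothetical and do not reproduce the study data or the authors' software.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha for a matrix with one row per respondent and one column per item."""
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    n_items = len(responses[0])
    # Transpose rows (respondents) into columns (items).
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(column) for column in item_columns)
    # Sample variance of each respondent's total score.
    # Note: raises ZeroDivisionError below if all totals are identical.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical 5-point ratings from six respondents on three items (not study data).
example_responses = [
    [5, 4, 5],
    [4, 4, 4],
    [3, 2, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 3, 4],
]
print(f"Cronbach alpha: {cronbach_alpha(example_responses):.2f}")
```

Values close to 1 indicate that the items vary together, which is the sense in which the 13-item and 7-item versions are described as internally consistent.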
CONCLUSION: A relatively brief 7-item measure of patient experience focused on trust has good psychometric properties and can eliminate most of the ceiling effects common to PREMs. Future studies may externally validate the TRECS in other populations as well as provide population-based T-score conversion tables based on a larger sample that is more representative of the population seeking musculoskeletal care.
CLINICAL RELEVANCE: A PREM anchored in trust that reduces loss of information at the higher end of the scale can help individuals and institutions to assess experience more accurately, gauge the impact of interventions, and generate effective ways to learn and improve within a health system.