Abbott Amy Ann, Shin Julia, Carlson Kathryn, Russell Marion, Qi Yongyue, Storm Hannah, Jewell Vanessa Dawn
Creighton University, Omaha, NE, USA.
Mesa Developmental Services, Grand Junction, CO, USA.
Br J Occup Ther. 2025 Mar;88(3):133-141. doi: 10.1177/03080226241283292. Epub 2024 Nov 9.
Establishing inter-rater agreement and reliability verifies that multiple raters evaluate observed interventions consistently, which helps ensure that clinical research interventions are delivered as intended by the trial protocol.
Using the Guidelines for Reporting Reliability and Agreement Studies, we (a) exemplified the steps to establish inter-rater reliability and inter-rater agreement on the occupation-based coaching Video Evaluation Tool and (b) evaluated best practices that promoted high inter-rater reliability and inter-rater agreement between blinded raters prior to starting a pilot randomized controlled trial. The randomized controlled trial examined the preliminary effectiveness of occupation-based coaching via telehealth for rural families with children living with type 1 diabetes to improve family quality of life, participation, self-efficacy, and child health outcomes.
We created a library of 13 occupation-based coaching videos portraying a range of evaluations, scores, and ratings. The inter-rater agreement and reliability on the occupation-based coaching Video Evaluation Tool were established through the iterations of (a) blinded rater training, (b) data collection using the tool, and (c) statistical analysis using Cohen's kappa and Cronbach's alpha.
Occurrence and Non-Occurrence Checklist (κ = 0.881, p < 0.001); "Caregiver Talk" and "Interventionist Talk Analysis" (ICC = 0.991-0.999, p < 0.001); Evidence of Independent Capacity Rating (ICC = 0.867, p = 0.006).
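For readers unfamiliar with the checklist statistic, the following is a minimal sketch of how Cohen's kappa could be computed for two raters scoring the same items as occurrence (1) or non-occurrence (0). The function name `cohens_kappa` and the ratings are hypothetical illustrations, not the study's data or the authors' analysis code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items (nominal labels)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")

    n = len(rater_a)

    # Observed agreement: proportion of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters' marginal
    # proportions, summed over all labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")

    return (observed - expected) / (1 - expected)


# Hypothetical checklist ratings (NOT study data): 1 = occurred, 0 = did not occur.
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]

kappa = cohens_kappa(rater_1, rater_2)
print(f"Cohen's kappa = {kappa:.3f}")
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why it is commonly preferred over simple percent agreement for dichotomous checklist items.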
Strong inter-rater reliability and inter-rater agreement were established by engaging two blinded raters in multifaceted training, integrating real-life clients and contexts into the instrumentation and training, and precisely defining rubric criteria. By employing such practices, high inter-rater reliability and agreement can be achieved in clinical research involving interventions and instruments that are highly subjective and individualized. To support greater scientific confidence in the intervention effect, it is necessary to develop a multidomain fidelity framework and to establish high inter-rater agreement and reliability in the instruments before clinical trials are implemented.