Tracy K, Adler L A, Rotrosen J, Edson R, Lavori P
Psychiatry Service, VA Medical Center/NYU School of Medicine, NY 10010, USA.
Psychopharmacol Bull. 1997;33(1):53-7.
This article describes a standardized method for establishing and maintaining desired levels of interrater reliability (IRR) in multicenter trials. The procedure involves six steps: distribution of procedural guides, distribution of an introduction tape, initial distribution of patient interviews to rate, training at the study kickoff meeting, ongoing IRR monitoring, and group training throughout the study. This method is being used in a national Veterans Affairs Cooperative Study (CS #394) that involves nine sites and examines the treatment effects of vitamin E on tardive dyskinesia. The six-step standardized process allowed for early detection of areas of concern in assessment administration. Comparison of intraclass correlation coefficients (ICCs) at different points in the initial training showed that reliability for the Barnes Akathisia Scale and the Anchored Brief Psychiatric Rating Scale improved from 0.68 to 0.74 and from 0.54 to 0.87, respectively. After the ratings collected prior to the start of CS #394 had been analyzed, data were collected during enrollment for the first check on Abnormal Involuntary Movement Scale (AIMS) IRR; the estimated ICC for the AIMS had decreased from 0.87 to 0.60. Raters were instructed to re-assess the subjects from the first videotape on the AIMS and received additional training. The re-rating indicated very good reliability (ICC = 0.84). IRR was measured once for the Global Assessment of Functioning Scale, resulting in an ICC of 0.90. The companion article (Part II: Edson et al. 1997, page 59 of this issue) describes the statistical procedures used to measure IRR.
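As a rough illustration of the kind of quantity being monitored, the sketch below computes a one-way random-effects ICC from a subjects-by-raters table of scores. The abstract does not state which ICC form the study used (the companion article covers the statistical procedures), so the one-way form here is an assumption chosen for simplicity. The function names `icc_oneway` and `below_target`, the target value, and the example scores are all hypothetical and are not taken from CS #394.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a table of scores.

    `ratings` is a list of rows, one row per subject, with one score
    per rater in each row. This is an assumed ICC form for illustration,
    not necessarily the one used in the study.
    """
    n = len(ratings)                      # number of subjects
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(ratings[0])                   # number of raters
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]

    # Between-subjects and within-subject sums of squares (one-way ANOVA).
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (score - m) ** 2
        for row, m in zip(ratings, subject_means)
        for score in row
    )

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ratings have no variance; ICC is undefined")
    return (ms_between - ms_within) / denominator


def below_target(icc, target):
    """Flag an ICC that falls below a chosen target (hypothetical rule)."""
    return icc < target


if __name__ == "__main__":
    # Made-up scores for four videotaped subjects rated by three raters.
    example = [
        [2, 3, 2],
        [5, 5, 6],
        [8, 7, 8],
        [1, 1, 2],
    ]
    icc = icc_oneway(example)
    print(f"ICC = {icc:.2f}, retraining suggested: {below_target(icc, 0.80)}")
```

Under this sketch, a monitoring check would recompute the ICC on each new batch of shared ratings and compare it with whatever target the study protocol sets; the 0.80 used above is only a placeholder.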