Vaccarino Anthony L, Kalali Amir H, Blier Pierre, Gilbert Evans Susan, Engelhardt Nina, Foster Jane A, Frey Benicio N, Greist John H, Kobak Kenneth A, Lam Raymond W, MacQueen Glenda, Milev Roumen, Müller Daniel J, Parikh Sagar V, Placenza Franca M, Rizvi Sakina J, Rotzinger Susan, Sheehan David V, Sills Terrence, Soares Claudio N, Turecki Gustavo, Uher Rudolph, Williams Janet B W, Kennedy Sidney H, Evans Kenneth R
Drs. Vaccarino, Evans and Gilbert Evans are with Indoc Research in Toronto, Ontario, Canada.
Dr. Kalali is with the International Society for CNS Drug Development in San Diego, California.
Innov Clin Neurosci. 2020 Jul 1;17(7-9):30-40.
The goal of the Depression Inventory Development (DID) project is to develop a comprehensive and psychometrically sound rating scale for major depressive disorder (MDD) that reflects current diagnostic criteria and conceptualizations of depression. We report here the evaluation of the current DID item bank using Classical Test Theory (CTT), Item Response Theory (IRT), and Rasch Measurement Theory (RMT).
The present study was part of a larger multisite, open-label study conducted by the Canadian Biomarker Integration Network in Depression (ClinicalTrials.gov: NCT01655706). Trained raters administered the 32 DID items at each of two visits (MDD: baseline, n=211 and Week 8, n=177; healthy participants: baseline, n=112 and Week 8, n=104). The DID's "grid" structure operationalizes the intensity and frequency of each item, supported by clear symptom definitions and a structured interview guide; the current iteration assesses symptoms related to [...] and [...]. Participants were also administered the Montgomery-Åsberg Depression Rating Scale (MADRS) and the Quick Inventory of Depressive Symptomatology-Self-Report (QIDS-SR), which allowed DID items to be evaluated against existing "benchmark" items. CTT was used to assess data quality and reliability (i.e., missing data, skewness, scoring frequency, internal consistency); IRT to assess individual item performance by modelling each item's ability to discriminate levels of depressive severity (as assessed by the MADRS); and RMT to assess how the items perform together as a scale in capturing a range of depressive severity (item targeting). Together, these analyses provided the empirical evidence on which to base decisions about which DID items to remove, modify, or advance.
Of the 32 DID items evaluated, CTT identified eight as problematic (low variability in the range of responses, floor effects, and/or skewness), and IRT identified four with poor discriminative properties that would limit their clinical utility. Five additional items were deemed redundant. The remaining 15 DID items all fit the Rasch model, with person and item difficulty estimates indicating satisfactory item targeting, although precision was lower in participants with mild levels of depression. These 15 items also showed good internal consistency (alpha=0.95; inter-item correlations ranging from r=0.49 to r=0.84), and all were sensitive to change following antidepressant treatment (baseline vs. Week 8). RMT revealed problematic item targeting for the MADRS and QIDS-SR, including an absence of MADRS items targeting participants with mild/moderate depression and an absence of QIDS-SR items targeting participants with mild or severe depression.
The present study applied CTT, IRT, and RMT to assess the measurement properties of the DID items and to identify those that should be advanced, modified, or removed. Of the 32 items evaluated, 15 showed good measurement properties. These items (along with previously evaluated items) will provide the basis for validation of a penultimate DID scale assessing [...] and [...]. The strategies adopted by the DID process provide a framework for rating scale development and validation.
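The CTT checks named above (internal consistency, floor effects, skewness) can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the toy scores, and the assumption that 0 is the lowest possible item score are all hypothetical, and only the Python standard library is used.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-participant item-score rows."""
    k = len(scores[0])  # number of items
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def floor_proportion(scores, item, minimum=0):
    """Share of participants scoring the lowest value on one item.

    `minimum=0` is an assumption about the scoring range.
    """
    return mean(1 if row[item] == minimum else 0 for row in scores)


def skewness(values):
    """Population (moment-based) skewness of a list of scores."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical toy data: 5 participants x 3 items (not DID data).
toy = [
    [0, 0, 0],
    [1, 1, 0],
    [2, 2, 0],
    [3, 3, 1],
    [4, 4, 0],
]

print("alpha:", cronbach_alpha(toy))
print("floor proportion, item 3:", floor_proportion(toy, 2))
print("skewness, item 3:", skewness([row[2] for row in toy]))
```

In this toy data the third item is scored at the minimum by most participants, so it would show both a high floor proportion and positive skew, which is the kind of pattern the abstract describes as problematic under CTT.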