Bagby R Michael, Ryder Andrew G, Schuller Deborah R, Marshall Margarita B
Centre for Addiction and Mental Health, 250 College St., Toronto, Ont., Canada M5T 1R8.
Am J Psychiatry. 2004 Dec;161(12):2163-77. doi: 10.1176/appi.ajp.161.12.2163.
The Hamilton Depression Rating Scale has been the gold standard for the assessment of depression for more than 40 years. Criticism of the instrument has been increasing. The authors review studies published since the last major review of this instrument in 1979 that explicitly examine the psychometric properties of the Hamilton depression scale. The authors' goal is to determine whether continued use of the Hamilton depression scale as a measure of treatment outcome is justified.
MEDLINE was searched for studies published since 1979 that examine psychometric properties of the Hamilton depression scale. Seventy studies were identified and selected, and then grouped into three categories on the basis of the major psychometric properties examined: reliability, item-response characteristics, and validity.
The Hamilton depression scale's internal reliability is adequate, but many scale items are poor contributors to the measurement of depression severity; others have poor interrater and retest reliability. For many items, the format for response options is not optimal. Content validity is poor; convergent validity and discriminant validity are adequate. The factor structure of the Hamilton depression scale is multidimensional but with poor replication across samples.
Evidence suggests that the Hamilton depression scale is psychometrically and conceptually flawed. The breadth and severity of the problems militate against efforts to revise the current instrument. After more than 40 years, it is time to embrace a new gold standard for assessment of depression.