Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada.
Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada.
Psychother Psychosom. 2020;89(1):25-37. doi: 10.1159/000502294. Epub 2019 Oct 8.
Screening for major depression with the Patient Health Questionnaire-9 (PHQ-9) can be done using a cutoff score or the PHQ-9 diagnostic algorithm. Many primary studies publish results for only one approach, and previous meta-analyses of the algorithm approach included only a subset of the primary studies that collected the relevant data and could have published results.
The objective was to use an individual participant data meta-analysis to evaluate the accuracy of two PHQ-9 diagnostic algorithms for detecting major depression and to compare the accuracy of the algorithms with that of the standard PHQ-9 cutoff score of ≥10.
Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, and Web of Science were searched (January 1, 2000, to February 7, 2015). Eligible studies classified current major depression status using a validated diagnostic interview.
Data were included for 54 of 72 identified eligible studies (16,688 participants, 2,091 cases). Among studies that used a semi-structured interview as the reference standard, pooled sensitivity and specificity (95% confidence interval) were 0.57 (0.49-0.64) and 0.95 (0.94-0.97) for the original algorithm and 0.61 (0.54-0.68) and 0.95 (0.93-0.96) for a modified algorithm. Relative to these estimates, algorithm sensitivity was 0.22-0.24 lower when the reference standard was a fully structured interview and 0.06-0.07 lower when it was the Mini International Neuropsychiatric Interview. Specificity was similar across reference standards. For the PHQ-9 cutoff of ≥10 compared to semi-structured interviews, sensitivity and specificity (95% confidence interval) were 0.88 (0.82-0.92) and 0.86 (0.82-0.88).
The cutoff score approach appears to be a better option than a PHQ-9 algorithm for detecting major depression.
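The two screening approaches compared above can be illustrated with a minimal Python sketch. It is not taken from the study. The function names are hypothetical, and the algorithm rule is an assumption based on the commonly described original PHQ-9 diagnostic algorithm: at least five symptoms present, at least one of them being item 1 or item 2, where a symptom counts as present at a score of 2 or more and item 9 counts at a score of 1 or more. The abstract does not define the modified algorithm, so it is not sketched here.

```python
from typing import Sequence


def _validate(items: Sequence[int]) -> None:
    """Check that we have nine PHQ-9 item scores, each between 0 and 3."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("PHQ-9 needs nine item scores, each 0, 1, 2, or 3")


def screen_by_cutoff(items: Sequence[int], cutoff: int = 10) -> bool:
    """Positive screen if the total score reaches the cutoff (>= 10 here)."""
    _validate(items)
    return sum(items) >= cutoff


def screen_by_original_algorithm(items: Sequence[int]) -> bool:
    """Positive screen under the assumed original diagnostic algorithm.

    Assumption (not stated in the abstract): items 1-8 count as present
    at a score >= 2, item 9 counts at a score >= 1, and a positive screen
    needs >= 5 symptoms present including item 1 or item 2.
    """
    _validate(items)
    present = [score >= 2 for score in items[:8]] + [items[8] >= 1]
    has_cardinal_symptom = present[0] or present[1]
    return sum(present) >= 5 and has_cardinal_symptom


# Hypothetical response pattern: many mild symptoms, total score of 10.
mild_but_broad = [1, 1, 1, 1, 1, 1, 1, 1, 2]
print(screen_by_cutoff(mild_but_broad))              # True
print(screen_by_original_algorithm(mild_but_broad))  # False
```

In this hypothetical pattern the cutoff flags the respondent while the algorithm does not, which is one way the algorithm can end up less sensitive than the cutoff, in line with the pooled estimates reported above.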