Nationwide Children's Hospital, Columbus, OH, USA.
University of Missouri, Department of Educational, School and Counseling Psychology, Columbia, MO, USA.
J Affect Disord. 2023 Dec 1;342:76-84. doi: 10.1016/j.jad.2023.09.013. Epub 2023 Sep 13.
Technically sound measures are necessary for accurately identifying youth at risk for depression, but many studies rely on classical test theory metrics or adult samples to evaluate measures. This study examined the use of the PHQ-8, a common and freely available pediatric depression screener, in an adolescent sample using item response theory (IRT).
Secondary analyses used data from a study in Midwestern middle schools in which 1224 youth completed the PHQ-8 as part of a battery of surveys. A polytomous IRT model (the graded response model) was used to evaluate the PHQ-8. Items were examined for their ability to distinguish between respondents with different levels of latent depression severity and for differential item functioning (DIF) across demographic categories.
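For readers unfamiliar with the graded response model, the following minimal sketch shows how it turns a latent severity level into probabilities for the four PHQ-8 response options. It assumes the common logistic form of the model; the function name and the parameter values (discrimination `a`, thresholds `b`) are hypothetical and chosen for illustration only. They are not estimates from this study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Response-category probabilities under a logistic graded response model.

    theta      -- latent trait level (here, depression severity)
    a          -- item discrimination (hypothetical value in the example)
    thresholds -- ordered category thresholds b_1 < b_2 < ... < b_{K-1}
    """
    # Cumulative probabilities P(X >= k) for k = 1..K-1
    cum = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then take adjacent differences
    bounds = [1.0] + cum + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# Illustrative item with four response options (scored 0-3), as on the PHQ-8.
# These parameter values are assumptions, not results from the study.
probs = grm_category_probs(theta=0.5, a=1.8, thresholds=[-0.5, 0.6, 1.5])
for score, p in enumerate(probs):
    print(f"P(response = {score}) = {p:.3f}")
```

In this formulation, a larger `a` makes the category probabilities change more sharply with `theta`, which is what is meant by an item discriminating between respondents of different latent severity.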
All PHQ-8 items had adequate discriminative abilities. Items measuring anhedonia and psychomotor disturbances performed relatively poorly, and items measuring somatic symptoms (appetite and sleep) were most informative when respondents endorsed extreme response options ("not at all" or "nearly every day"). No DIF was found across grade level or race, but several items were flagged for DIF by gender and student income level.
These results might not generalize to a broader youth population because of the administration setting and the unique demographic characteristics of this sample (76.0% African American).
Tools such as the PHQ-8 are appropriate to quickly screen for depression in adolescents, but further scrutiny of adolescent response patterns is warranted. Future research should examine items measuring anhedonia and psychomotor and somatic disturbances in adolescents.