State of the psychometric methods: patient-reported outcome measure development and refinement using item response theory.

Authors

Stover Angela M, McLeod Lori D, Langer Michelle M, Chen Wen-Hung, Reeve Bryce B

Affiliations

Department of Health Policy and Management, University of North Carolina at Chapel Hill, 1101-G McGavran-Greenberg Hall (CB# 7411), Chapel Hill, NC, 27599, USA.

Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill School of Medicine, 101 Manning Drive, Chapel Hill, NC, 27599, USA.

Publication information

J Patient Rep Outcomes. 2019 Jul 30;3(1):50. doi: 10.1186/s41687-019-0130-5.

DOI: 10.1186/s41687-019-0130-5
PMID: 31359210
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6663947/
Abstract

BACKGROUND

This paper is part of a series comparing different psychometric approaches to evaluate patient-reported outcome (PRO) measures using the same items and dataset. We provide an overview and example application to demonstrate 1) using item response theory (IRT) to identify poor and well performing items; 2) testing if items perform differently based on demographic characteristics (differential item functioning, DIF); and 3) balancing IRT and content validity considerations to select items for short forms.

METHODS

Model fit, local dependence, and DIF were examined for 51 items initially considered for the Patient-Reported Outcomes Measurement Information System® (PROMIS®) Depression item bank. Samejima's graded response model was used to examine how well each item measured severity levels of depression and how well it distinguished between individuals with high and low levels of depression. Two short forms were constructed based on psychometric properties and consensus discussions with instrument developers, including psychometricians and content experts. Calibrations presented here are for didactic purposes and are not intended to replace official PROMIS parameters or to be used for research.
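As an illustration of the graded response model named above, the following minimal sketch computes response-category probabilities for a single polytomous item. It is not the authors' calibration code: the function name `grm_category_probs` and the parameter values (discrimination `a = 2.0`, four thresholds for a five-category item) are hypothetical and chosen only to show the mechanics. The discrimination parameter governs how sharply the item separates respondents with high and low depression, and the thresholds mark the severity levels at which a respondent becomes more likely to choose a higher category.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    theta      -- latent trait level (here, depression severity)
    a          -- item discrimination (slope); hypothetical value in the demo
    thresholds -- ordered thresholds b_1 < ... < b_{K-1} for a K-category item
    Returns a list of K probabilities, one per response category.
    """
    # Cumulative curves P(X >= k): fixed at 1 for the lowest category,
    # logistic in between, and 0 beyond the highest category.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical five-category item evaluated at an average trait level.
probs = grm_category_probs(theta=0.0, a=2.0, thresholds=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

Because the thresholds are ordered, the cumulative curves never cross, so every category probability is non-negative and the probabilities sum to one at any trait level.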

RESULTS

Of the 51 depression items, 14 exhibited local dependence, 3 exhibited DIF for gender, and 9 exhibited misfit, and these items were removed from consideration for short forms. Short form 1 prioritized content, and thus items were chosen to meet DSM-V criteria rather than being discarded for lower discrimination parameters. Short form 2 prioritized well performing items, and thus fewer DSM-V criteria were satisfied. Short forms 1-2 performed similarly for model fit statistics, but short form 2 provided greater item precision.

CONCLUSIONS

IRT is a family of flexible models providing item- and scale-level information, making it a powerful tool for scale construction and refinement. Strengths of IRT models include placing respondents and items on the same metric, testing DIF across demographic or clinical subgroups, and facilitating creation of targeted short forms. Limitations include the need for large sample sizes to obtain stable item parameters and for familiarity with measurement methods to interpret results. Combining psychometric data with stakeholder input (including people with lived experiences of the health condition and clinicians) is highly recommended for scale development and evaluation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e242/6663947/caac73aa5d1f/41687_2019_130_Fig2_HTML.jpg

Similar articles

1
State of the psychometric methods: patient-reported outcome measure development and refinement using item response theory.
J Patient Rep Outcomes. 2019 Jul 30;3(1):50. doi: 10.1186/s41687-019-0130-5.
2
Measurement Equivalence of the Patient Reported Outcomes Measurement Information System (PROMIS) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations.
Psychol Test Assess Model. 2016;58(2):309-352.
3
Psychometric Properties and Performance of the Patient Reported Outcomes Measurement Information System (PROMIS) Depression Short Forms in Ethnically Diverse Groups.
Psychol Test Assess Model. 2016;58(1):141-181.
4
Measurement Equivalence of the Patient Reported Outcomes Measurement Information System (PROMIS) Applied Cognition - General Concerns, Short Forms in Ethnically Diverse Groups.
Psychol Test Assess Model. 2016;58(2):255-307.
5
Measurement Equivalence of the Patient Reported Outcomes Measurement Information System (PROMIS) Anxiety Short Forms in Ethnically Diverse Groups.
Psychol Test Assess Model. 2016;58(1):183-219.
6
Psychometric evaluation of an Italian custom 4-item short form of the PROMIS anxiety item bank in immune-mediated inflammatory diseases: an item response theory analysis.
PeerJ. 2021 Oct 27;9:e12100. doi: 10.7717/peerj.12100. eCollection 2021.
7
8
Differential Item Functioning Analyses of the Patient-Reported Outcomes Measurement Information System (PROMIS®) Measures: Methods, Challenges, Advances, and Future Directions.
Psychometrika. 2021 Sep;86(3):674-711. doi: 10.1007/s11336-021-09775-0. Epub 2021 Jul 12.
9
Examination of the Measurement Equivalence of the Functional Assessment in Acute Care MCAT (FAMCAT) Mobility Item Bank Using Differential Item Functioning Analyses.
Arch Phys Med Rehabil. 2022 May;103(5S):S84-S107.e38. doi: 10.1016/j.apmr.2021.03.044. Epub 2021 Jun 16.
10
Psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System Pain Behavior item bank in patients with musculoskeletal complaints.
J Pain. 2019 Nov;20(11):1328-1337. doi: 10.1016/j.jpain.2019.05.003. Epub 2019 May 9.

Cited by

1
The Life Impact Burn Recovery Evaluation (LIBRE) Profile: Historical Overview and Future Directions.
Eur Burn J. 2025 May 14;6(2):23. doi: 10.3390/ebj6020023.
2
Customizing Computerized Adaptive Test Stopping Rules for Clinical Settings Using the Negative Affect Subdomain of the NIH Toolbox Emotion Battery: Simulation Study.
JMIR Form Res. 2025 Mar 21;9:e60215. doi: 10.2196/60215.
3
Comparative performance of PROMIS Sleep Disturbance computerized adaptive testing algorithms and static short form in postmenopausal women.
J Patient Rep Outcomes. 2025 Feb 17;9(1):18. doi: 10.1186/s41687-025-00849-6.
4
Development and validation of the "Adjustment Disorder Scale for Medically Ill Patients - ETAM".
Front Psychiatry. 2025 Jan 30;16:1482888. doi: 10.3389/fpsyt.2025.1482888. eCollection 2025.
5
Psychometric assessment of scales used to evaluate sexual assault prevention programming in the United States Air Force.
PLoS One. 2025 Jan 16;20(1):e0317557. doi: 10.1371/journal.pone.0317557. eCollection 2025.
6
Pharmacist-facilitated Patient Reported Outcome Measure (PROM) monitoring: developing an EHR SmartForm© to monitor side effects of oral oncolytics during routine telehealth encounters.
Qual Life Res. 2025 Jan;34(1):201-217. doi: 10.1007/s11136-024-03789-8. Epub 2024 Oct 15.
7
Psychometric analysis and the implications for the use of the scoliosis research society questionnaire (SRS-22r English) for individuals with adolescent idiopathic scoliosis.
N Am Spine Soc J. 2024 Jul 31;19:100545. doi: 10.1016/j.xnsj.2024.100545. eCollection 2024 Sep.
8
Rasch Measurement Theory (RMT) Analyses of the Huntington's Disease Everyday Functioning (Hi-DEF) to Evaluate Item Fit and Performance.
J Huntingtons Dis. 2024;13(3):385-397. doi: 10.3233/JHD-240001.
9
Cross-cultural adaptation and validation of the caregiver self-efficacy in contributing to patient self-care scale in China.
BMC Public Health. 2024 Jul 24;24(1):1977. doi: 10.1186/s12889-024-19534-2.
10
Development of short forms for screening children's dental caries and urgent treatment needs using item response theory and machine learning methods.
PLoS One. 2024 Mar 22;19(3):e0299947. doi: 10.1371/journal.pone.0299947. eCollection 2024.

References cited in this article

1
State of the psychometric methods: comments on the ISOQOL SIG psychometric papers.
J Patient Rep Outcomes. 2019 Jul 30;3(1):49. doi: 10.1186/s41687-019-0134-1.
2
Psychometric evaluation of the PROMIS® Depression Item Bank: an illustration of classical test theory methods.
J Patient Rep Outcomes. 2019 Jul 30;3(1):46. doi: 10.1186/s41687-019-0127-0.
3
Many ways to skin a cat: psychometric methods options illustrated.
J Patient Rep Outcomes. 2019 Jul 30;3(1):48. doi: 10.1186/s41687-019-0133-2.
4
A general Bayesian multilevel multidimensional IRT model for locally dependent data.
Br J Math Stat Psychol. 2018 Nov;71(3):536-560. doi: 10.1111/bmsp.12133. Epub 2018 Jun 7.
5
A Monte Carlo Study of an Iterative Wald Test Procedure for DIF Analysis.
Educ Psychol Meas. 2017 Jan;77(1):104-118. doi: 10.1177/0013164416637104. Epub 2016 Mar 7.
6
Scale development with small samples: a new application of longitudinal item response theory.
Qual Life Res. 2018 Jul;27(7):1721-1734. doi: 10.1007/s11136-018-1801-z. Epub 2018 Feb 8.
7
Alternative Approaches to Addressing Non-Normal Distributions in the Application of IRT Models to Personality Measures.
J Pers Assess. 2018 Jul-Aug;100(4):363-374. doi: 10.1080/00223891.2017.1381969. Epub 2017 Oct 31.
8
Patient-reported outcome use in oncology: a systematic review of the impact on patient-clinician communication.
Support Care Cancer. 2018 Jan;26(1):41-60. doi: 10.1007/s00520-017-3865-7. Epub 2017 Aug 28.
9
Differential item functioning magnitude and impact measures from item response theory models.
Psychol Test Assess Model. 2016;58(1):79-98.
10
Routine use of patient reported outcome measures (PROMs) for improving treatment of common mental health disorders in adults.
Cochrane Database Syst Rev. 2016 Jul 13;7(7):CD011119. doi: 10.1002/14651858.CD011119.pub2.