
State of the psychometric methods: comments on the ISOQOL SIG psychometric papers.

Author information

Bjorner Jakob B

Affiliations

Optum Patient Insights, Johnston, USA.

Department of Public Health, University of Copenhagen, Copenhagen, Denmark.

Publication information

J Patient Rep Outcomes. 2019 Jul 30;3(1):49. doi: 10.1186/s41687-019-0134-1.

DOI:10.1186/s41687-019-0134-1
PMID:31359221
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6663952/

Abstract

BACKGROUND

Psychometric analyses of patient-reported outcomes (PROs) typically use classical test theory (CTT), item response theory (IRT), or Rasch measurement theory (RMT). The three papers from the ISOQOL Psychometrics SIG examined the same data set using the three different approaches. By comparing the results from these papers, the current paper aims to examine the extent to which conclusions about the validity and reliability of a PRO tool depend on the selected psychometric approach.

MAIN TEXT

Regarding the basic statistical model, IRT and RMT are relatively similar but differ notably from CTT. However, modern applications of CTT diminish these differences. In analyses of item discrimination, CTT and IRT gave very similar results, while RMT requires equal discrimination and therefore suggested exclusion of items deviating too much from this requirement. Thus, fewer items fitted the Rasch model. In analyses of item thresholds (difficulty), IRT and RMT provided fairly similar results. Item thresholds are typically not evaluated in CTT. Analyses of local dependence showed only moderate agreement between methods, partly because the methods used different cut-offs for flagging important local dependence. Analyses of differential item functioning (DIF) showed good agreement between IRT and RMT. Agreement might be further improved by adjusting the cut-offs for important DIF. Analyses of measurement precision across the score range showed high agreement between IRT and RMT methods. CTT assumes constant measurement precision throughout the score range and thus gave different results. Category orderings were examined in RMT analyses by checking for reversed thresholds. However, this approach is controversial within the RMT community. The same issue can be examined with the nominal categories IRT model.
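
The contrast in measurement precision described above can be illustrated with a minimal sketch. It is not taken from the paper or from the three SIG analyses: it assumes dichotomous items, a two-parameter logistic (2PL) IRT model, a Rasch model obtained by fixing every discrimination at 1, and the textbook CTT standard error of measurement. The item parameters, the score standard deviation, and the reliability value are all hypothetical and chosen only for illustration.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing a dichotomous item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of one 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def conditional_se(theta, items):
    """Standard error of theta: 1 / sqrt(sum of item information)."""
    total_info = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total_info)


def ctt_sem(score_sd, reliability):
    """CTT standard error of measurement, one value for the whole score range."""
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical (discrimination a, difficulty b) pairs, not estimates from the paper.
irt_items = [(0.8, -1.0), (1.5, 0.0), (2.2, 1.0)]
# Rasch-style version of the same items: equal discrimination, same difficulties.
rasch_items = [(1.0, b) for _, b in irt_items]

for theta in (-3.0, 0.0, 3.0):
    se_irt = conditional_se(theta, irt_items)
    se_rasch = conditional_se(theta, rasch_items)
    print(f"theta={theta:+.1f}  SE(2PL)={se_irt:.2f}  SE(Rasch)={se_rasch:.2f}")

# Hypothetical CTT inputs: score SD of 10 and reliability of 0.91.
print(f"CTT SEM (constant): {ctt_sem(10.0, 0.91):.2f}")
```

In both the 2PL and the Rasch version, the standard error is smallest near the item difficulties and grows toward the extremes of the trait range, whereas the CTT value is a single number. Note that the IRT and Rasch standard errors are on the latent trait scale and the CTT value is on the raw score scale, so only the shape of the precision curve is comparable, not the magnitudes.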

CONCLUSIONS

While there are well-known differences between CTT, IRT and RMT, the comparison between three actual analyses revealed a great deal of agreement between the results from the methods. If the undogmatic attitude of the three current papers is maintained, the field will be well served.




References cited in this article

1
Order-Constrained Estimation of Nominal Response Model Parameters to Assess the Empirical Order of Categories.
Educ Psychol Meas. 2018 Oct;78(5):826-856. doi: 10.1177/0013164417714296. Epub 2017 Jun 19.
2
Psychometric evaluation of the PROMIS® Depression Item Bank: an illustration of classical test theory methods.
J Patient Rep Outcomes. 2019 Jul 30;3(1):46. doi: 10.1186/s41687-019-0127-0.
3
State of the psychometric methods: patient-reported outcome measure development and refinement using item response theory.
J Patient Rep Outcomes. 2019 Jul 30;3(1):50. doi: 10.1186/s41687-019-0130-5.
4
Psychometric performance of the PROMIS® depression item bank: a comparison of the 28- and 51-item versions using Rasch measurement theory.
J Patient Rep Outcomes. 2019 Jul 30;3(1):47. doi: 10.1186/s41687-019-0131-4.
5
Many ways to skin a cat: psychometric methods options illustrated.
J Patient Rep Outcomes. 2019 Jul 30;3(1):48. doi: 10.1186/s41687-019-0133-2.
6
An Analysis of (Dis)Ordered Categories, Thresholds, and Crossings in Difference and Divide-by-Total IRT Models for Ordered Responses.
Span J Psychol. 2017 Feb 13;20:E10. doi: 10.1017/sjp.2017.11.
7
Reversed thresholds in partial credit models: a reason for collapsing categories?
Assessment. 2014 Dec;21(6):765-74. doi: 10.1177/1073191114530775. Epub 2014 Apr 30.
8
On the Use, the Misuse, and the Very Limited Usefulness of Cronbach's Alpha.
Psychometrika. 2009 Mar;74(1):107-120. doi: 10.1007/s11336-008-9101-0. Epub 2008 Dec 11.
9
Translating health status questionnaires and evaluating their quality: the IQOLA Project approach. International Quality of Life Assessment.
J Clin Epidemiol. 1998 Nov;51(11):913-23. doi: 10.1016/s0895-4356(98)00082-1.