Optimizing statistical evaluation of multiclass classification in diagnostic radiology: a study of the two-parameter multidimensional nominal response model.

Authors

Nishio Mizuho, Ota Eiji

Affiliations

Kobe University, Kobe, Japan.

Futaba Numerical Technologies, Iruma, Japan.

Publication information

PeerJ Comput Sci. 2024 Oct 4;10:e2380. doi: 10.7717/peerj-cs.2380. eCollection 2024.

DOI:10.7717/peerj-cs.2380
PMID:39650450
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11623241/
Abstract

PURPOSE

This study aimed to enhance the multidimensional nominal response model (MDNRM) for multiclass classification in diagnostic radiology.

MATERIALS AND METHODS

This retrospective study extended the conventional nominal response model (NRM) to create the two-parameter MDNRM (2PL-MDNRM). Seven MDNRM models, comprising the original MDNRM and subtypes of 2PL-MDNRM, were used to estimate test-takers' abilities and test item complexity. These models were applied to a clinical diagnostic radiology dataset. Rhat values were calculated to evaluate model convergence. In addition, the widely applicable information criterion (wAIC) and Pareto-smoothed importance sampling leave-one-out cross-validation (LOO) were calculated to evaluate the goodness of fit of the seven models, and the best-performing model was selected based on these values. Probability of direction (PD) was used to evaluate whether one estimated parameter differed significantly from another.
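
For orientation, a conventional nominal response model assigns each response category a probability through a softmax over category-specific linear terms in the latent ability. The sketch below shows only that generic form. The function name `nrm_category_probs`, the parameter names `slopes` and `intercepts`, and all numeric values are illustrative assumptions, not the authors' 2PL-MDNRM parameterization.

```python
import numpy as np

def nrm_category_probs(theta, slopes, intercepts):
    """Category probabilities under a generic nominal response model.

    P(category k | theta) = softmax_k(slopes[k] * theta + intercepts[k])
    """
    logits = np.asarray(slopes, dtype=float) * theta + np.asarray(intercepts, dtype=float)
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Hypothetical three-class item; the values are chosen only for illustration.
probs = nrm_category_probs(theta=1.0, slopes=[0.0, 1.0, 2.0], intercepts=[0.0, 0.0, -1.0])
print(probs)
```

In this toy setup, raising `theta` shifts probability toward the category with the largest slope, which is how an ability parameter separates stronger from weaker test-takers in this family of models.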

RESULTS

All estimated parameters across the seven models demonstrated Rhat values below 1.10, indicating stable convergence. The best wAIC and LOO values (988 and 1,121, respectively) were achieved with 2PL-MDNRM using the truncated normal distribution and 2PL-MDNRM using the truncated normal distribution. Notably, one test-taker (radiologist) exhibited significantly superior ability compared to another based on PD results from the best models, while no significant difference was observed in nonoptimal models.
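
As a rough illustration of the PD comparison mentioned above, the probability of direction of a posterior sample is the share of draws lying on the more probable side of zero. The sketch below applies that definition to simulated draws of an ability difference between two test-takers. The function name `probability_of_direction` and the simulated numbers are assumptions for illustration and are not taken from the study's data.

```python
import numpy as np

def probability_of_direction(samples):
    """Share of posterior draws on the more probable side of zero."""
    samples = np.asarray(samples, dtype=float)
    share_positive = np.mean(samples > 0)
    share_negative = np.mean(samples < 0)
    return float(max(share_positive, share_negative))

# Simulated posterior draws of (ability of reader A - ability of reader B).
# These numbers are hypothetical, not results from the paper.
rng = np.random.default_rng(0)
ability_difference = rng.normal(loc=0.8, scale=0.4, size=4000)
pd_value = probability_of_direction(ability_difference)
print(f"PD = {pd_value:.3f}")
```

A PD close to 1 means nearly all posterior mass agrees on the sign of the difference, while a PD near 0.5 means the direction is essentially undetermined.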

CONCLUSION

2PL-MDNRM successfully achieved parameter estimation convergence, and its superiority over the original MDNRM was demonstrated through wAIC and LOO values.
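
To make the model-comparison criterion concrete, the sketch below computes WAIC from a matrix of pointwise log-likelihoods using the standard formula: the log pointwise predictive density minus a variance-based penalty, reported here on the deviance scale where lower is better. The function name `waic`, the deviance scaling, and the toy input are assumptions; the abstract does not state which scale or software the authors used.

```python
import numpy as np

def waic(log_lik):
    """WAIC on the deviance scale from an (n_draws, n_observations) log-likelihood matrix."""
    log_lik = np.asarray(log_lik, dtype=float)
    n_draws = log_lik.shape[0]
    # Log pointwise predictive density: log of the posterior-mean likelihood per observation.
    lppd = np.logaddexp.reduce(log_lik, axis=0) - np.log(n_draws)
    # Effective number of parameters: posterior variance of the log-likelihood per observation.
    p_waic = log_lik.var(axis=0, ddof=1)
    return float(-2.0 * (lppd.sum() - p_waic.sum()))

# Toy input: 4 posterior draws, 10 observations, each with likelihood 0.5 in every draw.
toy_log_lik = np.full((4, 10), np.log(0.5))
print(waic(toy_log_lik))
```

In practice the log-likelihood matrix would come from the posterior draws of a fitted model, and models with smaller values would be preferred, as in the comparison the abstract describes.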

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/bdb826e372ab/peerj-cs-10-2380-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/3da86223437b/peerj-cs-10-2380-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/21f285dca5a9/peerj-cs-10-2380-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/ee37f9634420/peerj-cs-10-2380-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/b50ec1be715b/peerj-cs-10-2380-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/87e12c184b93/peerj-cs-10-2380-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/9be0dc6e8b2b/peerj-cs-10-2380-g007.jpg
