Optimizing statistical evaluation of multiclass classification in diagnostic radiology: a study of the two-parameter multidimensional nominal response model.

Authors

Nishio Mizuho, Ota Eiji

Affiliations

Kobe University, Kobe, Japan.

Futaba Numerical Technologies, Iruma, Japan.

Publication information

PeerJ Comput Sci. 2024 Oct 4;10:e2380. doi: 10.7717/peerj-cs.2380. eCollection 2024.

DOI:10.7717/peerj-cs.2380

PMID:39650450

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11623241/

Abstract

PURPOSE

This study aimed to enhance the multidimensional nominal response model (MDNRM) for multiclass classification in diagnostic radiology.

MATERIALS AND METHODS

This retrospective study extended the conventional nominal response model (NRM) to create the two-parameter MDNRM (2PL-MDNRM). Seven MDNRM models, comprising the original MDNRM and subtypes of 2PL-MDNRM, were used to estimate test-takers' abilities and test item complexity. These models were applied to a clinical diagnostic radiology dataset. Rhat values were calculated to evaluate model convergence. In addition, the widely applicable information criterion (wAIC) and Pareto-smoothed importance sampling leave-one-out cross-validation (LOO) were calculated to evaluate the goodness of fit of the seven models, and the best-performing model was selected based on these values. Probability of direction (PD) was used to evaluate whether one estimated parameter differed significantly from another.
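The abstract does not give the model equations, so the following is only a minimal sketch of a generic nominal-response-style category probability, assuming a softmax over per-category linear terms. The function name `nrm_category_probs` and the parameters `theta`, `slope`, and `intercept` are hypothetical and are not taken from the study; the exact 2PL-MDNRM parameterization may differ.

```python
import numpy as np

def nrm_category_probs(theta, slope, intercept):
    """Category probabilities for one test-taker on one item.

    Assumed (hypothetical) form: logit_k = slope_k * theta_k + intercept_k,
    with one latent dimension per response category, followed by a softmax.
    """
    theta = np.asarray(theta, dtype=float)
    slope = np.asarray(slope, dtype=float)
    intercept = np.asarray(intercept, dtype=float)
    logits = slope * theta + intercept
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Three diagnostic categories; an ability of 2.0 on the first dimension
# makes the first category the most probable one in this sketch.
probs = nrm_category_probs(theta=[2.0, 0.0, 0.0],
                           slope=[1.0, 1.0, 1.0],
                           intercept=[0.0, 0.0, 0.0])
print(probs)
```

In this sketch, fixing every slope to 1 leaves only the intercepts as item parameters, and letting the slopes vary gives each item a second parameter. That is one plausible reading of "two-parameter", not a statement of the authors' specification.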

RESULTS

All estimated parameters across the seven models had Rhat values below 1.10, indicating stable convergence. The best wAIC and LOO values (988 and 1,121, respectively) were achieved with 2PL-MDNRM variants using the truncated normal distribution. Notably, based on PD results from the best models, one test-taker (a radiologist) showed significantly superior ability compared with another, whereas no significant difference was observed in the nonoptimal models.
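The two diagnostics reported here can be illustrated with a short sketch. It assumes posterior draws are available as NumPy arrays. The `rhat` function is the classic Gelman-Rubin statistic, not the rank-normalized split version used by current Stan releases, and it may not match what the study computed. The function names and the simulated draws are hypothetical.

```python
import numpy as np

def rhat(chains):
    """Classic Gelman-Rubin potential scale reduction factor.

    chains: array of shape (n_chains, n_draws) for a single parameter.
    """
    chains = np.asarray(chains, dtype=float)
    n_draws = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()          # W
    between = n_draws * chains.mean(axis=1).var(ddof=1)  # B
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(var_hat / within))

def probability_of_direction(draws):
    """Share of posterior draws that have the same sign as the majority."""
    draws = np.asarray(draws, dtype=float)
    return float(max(np.mean(draws > 0), np.mean(draws < 0)))

# Simulated draws only; these are not values from the study.
rng = np.random.default_rng(0)
ability_a = rng.normal(loc=1.0, scale=0.5, size=(4, 1000))
ability_b = rng.normal(loc=0.0, scale=0.5, size=(4, 1000))

print("Rhat (A):", rhat(ability_a))
print("PD of A - B:", probability_of_direction((ability_a - ability_b).ravel()))
```

Computing PD on the difference between two test-takers' ability draws is one common way to ask whether one ability exceeds the other. The threshold the authors used for "significant" is not stated in the abstract.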

CONCLUSION

2PL-MDNRM successfully achieved parameter estimation convergence, and its superiority over the original MDNRM was demonstrated through wAIC and LOO values.
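For readers unfamiliar with wAIC, the sketch below shows the standard computation from a matrix of pointwise log-likelihood values, assuming one row per posterior draw and one column per observation. The function name `waic` and the example input are hypothetical. The result is reported on the deviance scale (multiplied by -2), where lower is better; the abstract does not state which scale the study used.

```python
import numpy as np

def waic(log_lik):
    """Widely applicable information criterion on the deviance scale.

    log_lik: array of shape (n_draws, n_observations) holding the
    pointwise log-likelihood for each posterior draw.
    """
    log_lik = np.asarray(log_lik, dtype=float)
    # Log pointwise predictive density, using a stable log-mean-exp.
    col_max = log_lik.max(axis=0)
    lppd = np.sum(col_max + np.log(np.mean(np.exp(log_lik - col_max), axis=0)))
    # Effective number of parameters: posterior variance of the log-likelihood.
    p_waic = np.sum(log_lik.var(axis=0, ddof=1))
    return float(-2.0 * (lppd - p_waic))

# With a constant log-likelihood of -1 over 5 observations, the variance
# penalty is zero and the criterion equals -2 * (-5).
print(waic(np.full((100, 5), -1.0)))  # → 10.0
```

When comparing candidate models fitted to the same data, the model with the smaller value is preferred. PSIS-LOO follows the same convention but estimates out-of-sample fit through importance sampling rather than a variance penalty.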

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5828/11623241/bdb826e372ab/peerj-cs-10-2380-g001.jpg
