
Comparison between pystan and numpyro in Bayesian item response theory: evaluation of agreement of estimated latent parameters and sampling performance.

Authors

Nishio Mizuho, Ota Eiji, Matsuo Hidetoshi, Matsunaga Takaaki, Miyazaki Aki, Murakami Takamichi

Affiliations

Department of Radiology, Kobe University Graduate School of Medicine, Kobe, Japan.

Futaba Numerical Technologies, Iruma, Japan.

Publication information

PeerJ Comput Sci. 2023 Oct 5;9:e1620. doi: 10.7717/peerj-cs.1620. eCollection 2023.

DOI:10.7717/peerj-cs.1620
PMID:37869462
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10588711/
Abstract

PURPOSE

The purpose of this study is to compare two libraries dedicated to the Markov chain Monte Carlo method: pystan and numpyro. In the comparison, we mainly focused on the agreement of estimated latent parameters and the performance of sampling using the Markov chain Monte Carlo method in Bayesian item response theory (IRT).
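For orientation, the two IRT models named in this abstract are usually written as follows. This is the common textbook parametrization, given here as an assumption; the abstract does not state the exact form or the priors the authors used. Here $y_{ij}$ is the binary response of person $i$ to item $j$, $\theta_i$ is the person's latent ability, $b_j$ is the item difficulty, and $a_j$ is the item discrimination.

```latex
% 1PL-IRT: one item parameter (difficulty b_j)
P(y_{ij} = 1 \mid \theta_i, b_j) = \frac{1}{1 + \exp\bigl(-(\theta_i - b_j)\bigr)}

% 2PL-IRT: adds a discrimination parameter a_j per item
P(y_{ij} = 1 \mid \theta_i, a_j, b_j) = \frac{1}{1 + \exp\bigl(-a_j(\theta_i - b_j)\bigr)}
```

Setting $a_j = 1$ for every item reduces the 2PL model to the 1PL model. In the Bayesian setting, $\theta_i$, $a_j$, and $b_j$ receive prior distributions and are estimated by Markov chain Monte Carlo sampling.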

MATERIALS AND METHODS

Bayesian 1PL-IRT and 2PL-IRT were implemented with pystan and numpyro. Then, the Bayesian 1PL-IRT and 2PL-IRT were applied to two types of medical data obtained from a published article. The same prior distributions of latent parameters were used in both pystan and numpyro. Estimation results of latent parameters of 1PL-IRT and 2PL-IRT were compared between pystan and numpyro. Additionally, the computational cost of the Markov chain Monte Carlo method was compared between the two libraries. To evaluate the computational cost of IRT models, simulation data were generated from the medical data and numpyro.
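As a rough illustration of what generating simulation data for an IRT model can look like, the sketch below draws binary responses from a 2PL model using only the Python standard library. It is not the authors' procedure: the abstract says their simulation data came from the medical data and numpyro, whereas the function names (`prob_correct`, `simulate_responses`), the standard-normal and log-normal generating distributions, and the seed handling here are all assumptions made for the example.

```python
import math
import random


def prob_correct(theta, b, a=1.0):
    """Probability of a correct response under 2PL; a=1.0 gives the 1PL case."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(n_persons, n_items, seed=0, two_pl=True):
    """Draw latent parameters and a persons-by-items matrix of 0/1 responses.

    The generating distributions below are illustrative assumptions,
    not the priors used in the paper.
    """
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]  # person abilities
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]        # item difficulties
    # Item discriminations: positive for 2PL, fixed at 1.0 for 1PL.
    a = [rng.lognormvariate(0.0, 0.5) if two_pl else 1.0 for _ in range(n_items)]
    y = [
        [1 if rng.random() < prob_correct(theta[i], b[j], a[j]) else 0
         for j in range(n_items)]
        for i in range(n_persons)
    ]
    return theta, a, b, y


if __name__ == "__main__":
    theta, a, b, y = simulate_responses(n_persons=5, n_items=3, seed=1)
    for row in y:
        print(row)
```

Increasing `n_persons` and `n_items` gives larger response matrices of the kind one would use to study how sampling time scales with dataset size.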

RESULTS

For all the combinations of IRT types (1PL-IRT or 2PL-IRT) and medical data types, the mean and standard deviation of the estimated latent parameters were in good agreement between pystan and numpyro. In most cases, the sampling time using the Markov chain Monte Carlo method was shorter in numpyro than that in pystan. When the large-sized simulation data were used, numpyro with a graphics processing unit was useful for reducing the sampling time.
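One simple way to check this kind of agreement is to compare the posterior mean and standard deviation of each latent parameter across the two samplers. The sketch below assumes the posterior draws have already been exported as plain Python lists, one list per parameter. The abstract does not say which agreement statistic the authors used, so the maximum absolute difference here is an assumption, and `summarize` and `agreement` are hypothetical helper names.

```python
from statistics import mean, stdev


def summarize(draws):
    """Return (posterior mean, posterior SD) for each parameter.

    draws[k] is the list of MCMC samples for parameter k.
    """
    return [(mean(d), stdev(d)) for d in draws]


def agreement(draws_a, draws_b):
    """Largest absolute difference in posterior mean and in posterior SD
    between two samplers, taken over all parameters."""
    summary_a = summarize(draws_a)
    summary_b = summarize(draws_b)
    max_mean_diff = max(abs(x[0] - y[0]) for x, y in zip(summary_a, summary_b))
    max_sd_diff = max(abs(x[1] - y[1]) for x, y in zip(summary_a, summary_b))
    return max_mean_diff, max_sd_diff


if __name__ == "__main__":
    # Toy draws for two parameters from two hypothetical samplers.
    sampler_a = [[0.0, 1.0, 2.0], [1.0, 1.0, 4.0]]
    sampler_b = [[0.5, 1.5, 2.5], [1.0, 1.0, 4.0]]
    print(agreement(sampler_a, sampler_b))
```

Small values for both differences, relative to the posterior SDs themselves, would indicate that the two libraries produce similar estimates.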

CONCLUSION

Numpyro and pystan were useful for applying the Bayesian 1PL-IRT and 2PL-IRT. Our results show that the two libraries yielded similar estimation results and that, with regard to sampling time, the fastest library differed depending on the dataset size.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/7bb52024ba8a/peerj-cs-09-1620-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/8bc25c3c5d38/peerj-cs-09-1620-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/2ec5d87dcc3b/peerj-cs-09-1620-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/14b146980971/peerj-cs-09-1620-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/7c5c2a08dfb3/peerj-cs-09-1620-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/bf4db0ae3407/peerj-cs-09-1620-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/258f353633c8/peerj-cs-09-1620-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/04a40977f710/peerj-cs-09-1620-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/472a56958e1c/peerj-cs-09-1620-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/91ab2624981c/peerj-cs-09-1620-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ae8/10588711/e258d13847f3/peerj-cs-09-1620-g011.jpg

Similar articles

1. Comparison between pystan and numpyro in Bayesian item response theory: evaluation of agreement of estimated latent parameters and sampling performance.
   PeerJ Comput Sci. 2023 Oct 5;9:e1620. doi: 10.7717/peerj-cs.1620. eCollection 2023.
2. A Comparison of Monte Carlo Methods for Computing Marginal Likelihoods of Item Response Theory Models.
   J Korean Stat Soc. 2019 Dec;48(4):503-512. doi: 10.1016/j.jkss.2019.04.001. Epub 2019 May 17.
3. A Dominance Variant Under the Multi-Unidimensional Pairwise-Preference Framework: Model Formulation and Markov Chain Monte Carlo Estimation.
   Appl Psychol Meas. 2016 Oct;40(7):500-516. doi: 10.1177/0146621616662226. Epub 2016 Aug 20.
4. Bayesian Modal Estimation for the One-Parameter Logistic Ability-Based Guessing (1PL-AG) Model.
   Appl Psychol Meas. 2021 May;45(3):195-213. doi: 10.1177/0146621621990761. Epub 2021 Feb 8.
5. A Hierarchical Multi-Unidimensional IRT Approach for Analyzing Sparse, Multi-Group Data for Integrative Data Analysis.
   Psychometrika. 2015 Sep;80(3):834-55. doi: 10.1007/s11336-014-9420-2. Epub 2014 Sep 30.
6. Rasch Model Parameter Estimation in the Presence of a Nonnormal Latent Trait Using a Nonparametric Bayesian Approach.
   Educ Psychol Meas. 2016 Aug;76(4):662-684. doi: 10.1177/0013164415608418. Epub 2015 Oct 12.
7. A comparison of computational algorithms for the Bayesian analysis of clinical trials.
   Clin Trials. 2024 Dec;21(6):689-700. doi: 10.1177/17407745241247334. Epub 2024 May 16.
8. A Bayesian Random Block Item Response Theory Model for Forced-Choice Formats.
   Educ Psychol Meas. 2020 Jun;80(3):578-603. doi: 10.1177/0013164419871659. Epub 2019 Aug 27.
9. Bayesian Inference for IRT Models with Non-Normal Latent Trait Distributions.
   Multivariate Behav Res. 2021 Sep-Oct;56(5):703-723. doi: 10.1080/00273171.2020.1776096. Epub 2020 Jun 29.
10. Sample Size Requirements for Applying Mixed Polytomous Item Response Models: Results of a Monte Carlo Simulation Study.
    Front Psychol. 2019 Nov 13;10:2494. doi: 10.3389/fpsyg.2019.02494. eCollection 2019.

Cited by

1. Optimizing statistical evaluation of multiclass classification in diagnostic radiology: a study of the two-parameter multidimensional nominal response model.
   PeerJ Comput Sci. 2024 Oct 4;10:e2380. doi: 10.7717/peerj-cs.2380. eCollection 2024.
