Using Deep Reinforcement Learning to Decide Test Length.

Author information

Zoucha James, Himelfarb Igor, Tang Nai-En

Affiliations

University of Northern Colorado, Greeley, CO, USA.

National Board of Chiropractic Examiners, Greeley, CO, USA.

Publication information

Educ Psychol Meas. 2025 May 3:00131644251332972. doi: 10.1177/00131644251332972.

PMID:40330328

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12049363/

Abstract

This study explored the application of deep reinforcement learning (DRL) as an innovative approach to optimize test length. The primary focus was to evaluate whether the current length of the National Board of Chiropractic Examiners Part I Exam is justified. By modeling the problem as a combinatorial optimization task within a Markov Decision Process framework, an algorithm capable of constructing test forms from a finite set of items while adhering to critical structural constraints, such as content representation and item difficulty distribution, was used. The findings reveal that although the DRL algorithm was successful in identifying shorter test forms that maintained comparable ability estimation accuracy, the existing test length of 240 items remains advisable as we found shorter test forms did not maintain structural constraints. Furthermore, the study highlighted the inherent adaptability of DRL to continuously learn about a test-taker's latent abilities and dynamically adjust to their response patterns, making it well-suited for personalized testing environments. This dynamic capability supports real-time decision-making in item selection, improving both efficiency and precision in ability estimation. Future research is encouraged to focus on expanding the item bank and leveraging advanced computational resources to enhance the algorithm's search capacity for shorter, structurally compliant test forms.
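The Markov Decision Process framing described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-parameter logistic (2PL) item model, the synthetic item bank, the content areas "A", "B", and "C", the penalty value, and all function names are hypothetical. A greedy rule and a random rule stand in for the learned DRL policy. The state is the set of items chosen so far, an action adds one unused item, and the return is the accumulated Fisher information minus a terminal penalty when the content blueprint is not met.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    a: float     # discrimination (hypothetical 2PL parameter)
    b: float     # difficulty (hypothetical 2PL parameter)
    domain: str  # content area used by the blueprint constraint


def item_information(item, theta):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-item.a * (theta - item.b)))
    return item.a ** 2 * p * (1.0 - p)


def summed_information(item, thetas):
    """Information of one item summed over a grid of ability values."""
    return sum(item_information(item, t) for t in thetas)


def meets_blueprint(form, min_per_domain):
    """True if the form has at least the required item count per content area."""
    counts = {}
    for item in form:
        counts[item.domain] = counts.get(item.domain, 0) + 1
    return all(counts.get(d, 0) >= n for d, n in min_per_domain.items())


def greedy_policy(bank, chosen, remaining, thetas):
    """Pick the unused item with the largest summed information."""
    return max(sorted(remaining), key=lambda i: summed_information(bank[i], thetas))


def make_random_policy(rng):
    """Build a policy that picks an unused item uniformly at random."""
    def policy(bank, chosen, remaining, thetas):
        return rng.choice(sorted(remaining))
    return policy


def run_episode(bank, length, min_per_domain, thetas, policy, penalty=100.0):
    """One episode: state = chosen item indices, action = index of the next item."""
    chosen, remaining, info = [], set(range(len(bank))), 0.0
    for _ in range(length):
        action = policy(bank, chosen, remaining, thetas)
        remaining.remove(action)
        chosen.append(action)
        info += summed_information(bank[action], thetas)  # per-step reward
    feasible = meets_blueprint([bank[i] for i in chosen], min_per_domain)
    total_return = info - (0.0 if feasible else penalty)  # terminal penalty
    return chosen, info, feasible, total_return


def make_bank(n, seed=0):
    """Synthetic item bank with three made-up content areas."""
    rng = random.Random(seed)
    domains = ["A", "B", "C"]
    return [Item(rng.uniform(0.5, 2.0), rng.uniform(-2.0, 2.0), domains[i % 3])
            for i in range(n)]


if __name__ == "__main__":
    bank = make_bank(60, seed=1)
    thetas = [-1.0, 0.0, 1.0]
    blueprint = {"A": 2, "B": 2, "C": 2}
    for name, pol in [("greedy", greedy_policy),
                      ("random", make_random_policy(random.Random(2)))]:
        chosen, info, feasible, ret = run_episode(bank, 10, blueprint, thetas, pol)
        print(name, round(info, 3), feasible, round(ret, 3))
```

The greedy rule maximizes information but ignores the blueprint, so it can produce an infeasible form. That tension between measurement precision and structural constraints is the one the abstract reports for shorter forms; a learned policy would have to trade the two off through the reward signal.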
