Modeling Booklet Effects for Nonequivalent Group Designs in Large-Scale Assessment.

Authors

Hecht Martin, Weirich Sebastian, Siegle Thilo, Frey Andreas

Affiliations

Humboldt-Universität zu Berlin, Berlin, Germany.

Friedrich Schiller University Jena, Jena, Germany.

Publication information

Educ Psychol Meas. 2015 Aug;75(4):568-584. doi: 10.1177/0013164414554219. Epub 2014 Nov 3.

DOI: 10.1177/0013164414554219
PMID: 29795833
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5965618/

Abstract

Multiple matrix designs are commonly used in large-scale assessments to distribute test items to students. These designs comprise several booklets, each containing a subset of the complete item pool. Besides reducing the test burden of individual students, using various booklets allows aligning the difficulty of the presented items to the assumed performance level of examined subgroups. While this may improve measurement precision and students' test-taking motivation, using several booklets might influence response behavior and thus constitute a potential source of unwanted variation. To provide guidance to identify and model booklet effects, this study presents statistical models accounting for booklet effects and applies these models in a large-scale assessment setting. Three models are derived from the Rasch model employing the generalized linear mixed models framework. The models were applied to data from a national educational standards assessment study for scientific competence. A total of 1,021 items were compiled to 74 booklets distributed to a sample of 9,044 students of Grades 9 and 10. The results revealed a small but nonnegligible booklet effect. For further large-scale assessment studies, it is recommended to examine whether booklet effects occur and to adequately account for them in the subsequent analyses where necessary.
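The abstract does not spell out the three models, so the following is only a minimal sketch of one common way a booklet effect can enter a Rasch model in a generalized linear mixed model setting: as an additive shift on the logit scale, drawn from a normal distribution across booklets. The function names, parameter values, and the assignment of students to booklets are all illustrative assumptions, not the authors' specification or data.

```python
import math
import random


def rasch_prob(theta, beta, gamma=0.0):
    """Probability of a correct response.

    theta: person ability, beta: item difficulty,
    gamma: additive booklet shift (assumed form, 0.0 gives the plain Rasch model).
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta + gamma)))


def simulate(n_students=2000, n_items=40, n_booklets=4,
             items_per_booklet=20, booklet_sd=0.2, seed=1):
    """Simulate responses from a multiple matrix design with booklet effects.

    All sizes and the booklet standard deviation are hypothetical values
    chosen for illustration only.
    """
    rng = random.Random(seed)
    betas = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    # Random booklet effects: one shift per booklet.
    gammas = [rng.gauss(0.0, booklet_sd) for _ in range(n_booklets)]
    # Each booklet contains a subset of the complete item pool.
    booklets = [rng.sample(range(n_items), items_per_booklet)
                for _ in range(n_booklets)]
    rows = []
    for student in range(n_students):
        booklet = student % n_booklets  # simple rotation, an assumption
        theta = rng.gauss(0.0, 1.0)
        for item in booklets[booklet]:
            p = rasch_prob(theta, betas[item], gammas[booklet])
            rows.append((student, item, booklet, int(rng.random() < p)))
    return rows, gammas


if __name__ == "__main__":
    rows, gammas = simulate()
    print(len(rows), "responses;", "booklet shifts:",
          [round(g, 3) for g in gammas])
```

Data in this long format (one row per student-item response, with a booklet identifier) is the shape typically passed to mixed model software, where the variance of the booklet shifts would be estimated rather than fixed as it is here.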


Similar articles

1
Modeling Booklet Effects for Nonequivalent Group Designs in Large-Scale Assessment.
Educ Psychol Meas. 2015 Aug;75(4):568-584. doi: 10.1177/0013164414554219. Epub 2014 Nov 3.
2
Effects of Design Properties on Parameter Estimation in Large-Scale Assessments.
Educ Psychol Meas. 2015 Dec;75(6):1021-1044. doi: 10.1177/0013164415573311. Epub 2015 Mar 2.
3
Planning a Study for Testing the Rasch Model given Missing Values due to the use of Test-booklets.
J Appl Meas. 2015;16(4):432-42.
4
Assessing Validity of Measurement in Learning Disabilities Using Hierarchical Generalized Linear Modeling: The Roles of Anxiety and Motivation.
Educ Psychol Meas. 2016 Aug;76(4):638-661. doi: 10.1177/0013164415604440. Epub 2015 Sep 17.
5
Assessment and statistical modeling of the relationship between remotely sensed aerosol optical depth and PM2.5 in the eastern United States.
Res Rep Health Eff Inst. 2012 May(167):5-83; discussion 85-91.
6
Loss of Information in Estimating Item Parameters in Incomplete Designs.
Psychometrika. 2006 Jun;71(2):303-322. doi: 10.1007/s11336-004-1205-6. Epub 2017 Feb 11.
7
Implications of Removing Random Guessing from Rasch Item Estimates in Vertical Scaling.
J Appl Meas. 2015;16(2):113-28.
8
Red vs. green: Does the exam booklet color matter in higher education summative evaluations? Not likely.
Psychon Bull Rev. 2016 Oct;23(5):1596-1601. doi: 10.3758/s13423-016-1009-6.
9
A purpose-based evaluation of information for patients: an approach to measuring effectiveness.
Patient Educ Couns. 2007 Mar;65(3):311-9. doi: 10.1016/j.pec.2006.08.012. Epub 2006 Oct 2.
10
A randomized controlled trial comparing two educational booklets on prostate cancer.
Can J Urol. 2006 Dec;13(6):3321-6.

Cited by

1
A New Online Calibration Method Based on Lord's Bias-Correction.
Appl Psychol Meas. 2017 Sep;41(6):456-471. doi: 10.1177/0146621617697958. Epub 2017 Mar 26.