Construct validity of primary trait writing rubrics based on assessment use argument (AUA) validation framework.

Authors

Asli Nurul Fariena, Mohd Matore Mohd Effendi Ewan, Md Yunus Melor

Affiliation

Faculty of Education, The National University of Malaysia, 43600, Bangi, Selangor, Malaysia.

Publication

Heliyon. 2024 Nov 4;10(22):e40053. doi: 10.1016/j.heliyon.2024.e40053. eCollection 2024 Nov 30.

DOI:10.1016/j.heliyon.2024.e40053
PMID:39619579
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11605350/
Abstract

In performance-based language assessment, a valid and reliable scoring rubric is crucial for minimizing the measurement errors that threaten the rating process. Rubric validation has previously relied on qualitative data, which is unsatisfactory because it provides no empirical evidence. Drawing on the Assessment Use Argument (AUA) validation framework, this study therefore seeks evidence for the claim that Primary Trait Writing (PTW) rubrics for students' self-assessment activities are relevant to the construct being measured. Two warrants and one rebuttal were derived to test the claim. The participants were 149 secondary school students in one state in Malaysia, and the three facets identified in the study were the examinee (149 students), the rater (149 students and 3 teachers), and the trait (Content, Format, Cohesive Device, and Sentence Fluency). The Many-Facet Rasch Model (MFRM) was used to seek evidence supporting the warrants and rejecting the rebuttal. The statistical results showed that the PTW rubrics successfully discriminated between levels of student writing ability and fulfilled the six basic conditions of rating scale effectiveness to a certain extent, with the Cohesive Device trait the primary concern. In addition, the fit statistics for all traits demonstrated internal consistency, and the high reliability index indicated that the criteria were well differentiated in difficulty. The evidence thus showed that the PTW rubrics have construct validity: the warrants were supported and the rebuttal was rejected, which led to acceptance of the claim. The study highlights the importance of validating assessment rubrics to ensure their internal validity, and the value of MFRM in providing comprehensive analyses of the strengths and weaknesses of the developed rubrics.
The use of a primary trait scoring rubric as a tool in students' self-assessment activities deserves emphasis, because most current studies focus on holistic and analytic scoring. Future research should therefore extend the Primary Trait rubrics to other types of essay and to peer-assessment activities.
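
As a rough illustration of the Many-Facet Rasch Model named in the abstract, the sketch below computes rating-category probabilities for the three facets listed there (examinee, rater, trait). It uses the standard rating-scale form of the model, in which the log-odds of adjacent categories equal examinee ability minus rater severity minus trait difficulty minus a category threshold. The function name, parameter names, and all numeric values are illustrative assumptions; none of them come from the study, and the abstract does not state which model variant or software the authors used.

```python
import math


def category_probabilities(ability, severity, difficulty, thresholds):
    """Return P(score = k) for k = 0..len(thresholds) under a rating-scale
    many-facet Rasch model. All inputs are in logits (hypothetical values)."""
    # Category 0 has log-numerator 0; each higher category adds
    # (ability - severity - difficulty - threshold_k).
    log_numerators = [0.0]
    for tau in thresholds:
        step = ability - severity - difficulty - tau
        log_numerators.append(log_numerators[-1] + step)

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Made-up example: a fairly able student, a slightly severe rater,
    # an easy trait, and a four-category scale (three thresholds).
    probs = category_probabilities(
        ability=1.0, severity=0.5, difficulty=-0.2, thresholds=[-1.0, 0.0, 1.0]
    )
    for score, p in enumerate(probs):
        print(f"score {score}: {p:.3f}")
```

Raising `ability` shifts probability toward higher scores, while raising `severity` or `difficulty` shifts it toward lower ones, which is how the model places examinees, raters, and traits on a single logit scale.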

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aace/11605350/4b5f5fe54462/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aace/11605350/39ceed0fd4f0/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aace/11605350/817699cde4e3/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aace/11605350/c1f05a2cdd0a/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aace/11605350/0bee4c018d0f/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aace/11605350/e4e57e1ee3f2/gr6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aace/11605350/97e9f51657f8/gr7.jpg

Similar articles

1. Construct validity of primary trait writing rubrics based on assessment use argument (AUA) validation framework. Heliyon. 2024 Nov 4;10(22):e40053. doi: 10.1016/j.heliyon.2024.e40053. eCollection 2024 Nov 30.
2. The raters' differences in Arabic writing rubrics through the Many-Facet Rasch measurement model. Front Psychol. 2022 Dec 16;13:988272. doi: 10.3389/fpsyg.2022.988272. eCollection 2022.
3. A Facets Analysis of Analytic vs. Holistic Scoring of Identical Short Constructed-Response Items: Different Outcomes and Their Implications for Scoring Rubric Development. J Appl Meas. 2017;18(3):228-246.
4. Development of peer assessment rubrics in simulation-based learning for advanced cardiac life support skills among medical students. Adv Simul (Lond). 2024 Jun 24;9(1):25. doi: 10.1186/s41077-024-00301-7.
5. Development and Validation of a Tool to Evaluate the Evolution of Clinical Reasoning in Trauma Using Virtual Patients. J Surg Educ. 2018 May-Jun;75(3):779-786. doi: 10.1016/j.jsurg.2017.08.024. Epub 2017 Sep 18.
6. Improving assessment of procedural skills in health sciences education: a validation study of a rubrics system in neurophysiotherapy. BMC Psychol. 2024 Mar 14;12(1):147. doi: 10.1186/s40359-024-01643-7.
7. Reliability and validity of simulation-based Electrocardiogram assessment rubrics for cardiac life support skills among medical students using generalizability theory. Med Educ Online. 2025 Dec;30(1):2479962. doi: 10.1080/10872981.2025.2479962. Epub 2025 Mar 23.
8. Validation of rubric-based evaluation for bachelor's theses in a food science and technology degree. J Food Sci. 2024 May;89(5):3129-3138. doi: 10.1111/1750-3841.17044. Epub 2024 Apr 5.
9. Assessing the competences associated with a nursing Bachelor thesis by means of rubrics. Nurse Educ Today. 2018 Jul;66:103-109. doi: 10.1016/j.nedt.2018.04.009. Epub 2018 Apr 17.
10. A Retrospective Cohort Analysis Comparing Analytic and Holistic Marking Rubrics in Medical Research Education. J Med Educ Curric Dev. 2024 Aug 28;11:23821205241277337. doi: 10.1177/23821205241277337. eCollection 2024 Jan-Dec.

Cited by

1. Academic Integrity Within the Medical Curriculum in the Age of Generative Artificial Intelligence. Health Sci Rep. 2025 Feb 19;8(2):e70489. doi: 10.1002/hsr2.70489. eCollection 2025 Feb.
