Standard setting: comparison of two methods.

Authors

George Sanju, Haque M Sayeed, Oyebode Femi

Affiliation

Queen Elizabeth Psychiatric Hospital, Mindelsohn Way, Edgbaston, Birmingham, UK.

Publication information

BMC Med Educ. 2006 Sep 14;6:46. doi: 10.1186/1472-6920-6-46.

DOI: 10.1186/1472-6920-6-46

PMID: 16972990

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1578558/

Abstract

BACKGROUND

The outcome of an assessment is determined by the standard-setting method used. There is a wide range of standard-setting methods; the two used most extensively in undergraduate medical education in the UK are the norm-reference and the criterion-reference methods. The aims of the study were to compare these two approaches for a multiple-choice question examination, using the modified Angoff method as the criterion-reference method, and to estimate the test-retest and inter-rater reliability of the modified Angoff method.

METHODS

The norm-reference method of standard setting (mean minus 1 SD) was applied to the 'raw' scores of 78 4th-year medical students on a multiple-choice question (MCQ) examination. Two panels of raters also set the standard for the same MCQ paper using the modified Angoff method on two occasions (6 months apart). We compared the pass/fail rates derived from the norm-reference and the Angoff methods and also assessed the test-retest and inter-rater reliability of the modified Angoff method.
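The two ways of deriving a cut score described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the sample standard deviation is assumed (the abstract does not say whether the sample or population SD was used), a score equal to the cut score is assumed to pass, and the Angoff cut score is assumed to be the sum over items of the mean judge estimate, with one mark per item.

```python
from statistics import mean, stdev


def norm_reference_cut(scores, k=1.0):
    """Norm-referenced cut score: cohort mean minus k standard deviations.

    Assumption: sample SD (n - 1 denominator); the source does not specify.
    """
    return mean(scores) - k * stdev(scores)


def angoff_cut(ratings):
    """Angoff-style cut score from a judges-by-items matrix.

    ratings[j][i] is judge j's estimate of the probability that a
    borderline candidate answers item i correctly. The cut score is the
    sum over items of the mean estimate (assumes one mark per item).
    """
    n_items = len(ratings[0])
    item_means = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    return sum(item_means)


def pass_rate(scores, cut):
    """Fraction of candidates scoring at or above the cut (assumed rule)."""
    return sum(s >= cut for s in scores) / len(scores)
```

Because the norm-referenced cut depends on the cohort's own distribution while the Angoff cut depends only on judges' item ratings, the same set of scores can yield different pass rates under the two functions.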

RESULTS

The pass rate with the norm-reference method was 85% (66/78) and that with the Angoff method was 100% (78/78). The percentage agreement between the Angoff method and the norm-reference method was 78% (95% CI 69% to 87%). The modified Angoff method had an inter-rater reliability of 0.81-0.82 and a test-retest reliability of 0.59-0.74.
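The arithmetic behind a pass rate or percentage agreement with its confidence interval can be sketched as follows. This is an illustration under stated assumptions: the abstract does not say how the interval was computed, so a normal-approximation (Wald) interval is assumed here, and the function names are hypothetical.

```python
from math import sqrt


def proportion_ci(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval.

    Assumption: Wald interval with z = 1.96 for 95% coverage; the source
    does not state which interval method was used.
    Returns (proportion, lower bound, upper bound), clipped to [0, 1].
    """
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def percent_agreement(decisions_a, decisions_b):
    """Fraction of candidates given the same pass/fail decision by both methods."""
    if len(decisions_a) != len(decisions_b):
        raise ValueError("decision lists must have the same length")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return matches / len(decisions_a)
```

For example, `proportion_ci(66, 78)` gives a proportion of about 0.85, matching the reported norm-reference pass rate. The abstract does not report the count behind the 78% agreement; if it were 61 of 78 (an assumption, chosen because it is the only whole number out of 78 that rounds to 78%), `proportion_ci(61, 78)` would give roughly 0.78 with bounds near 0.69 and 0.87.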

CONCLUSION

There were significant differences in the outcomes of these two standard-setting methods, as shown by the difference in the proportion of candidates that passed and failed the assessment. The modified Angoff method was found to have good inter-rater reliability and moderate test-retest reliability.

Similar articles

A comparison of different standard-setting methods for professional qualifying dental examination.

J Dent Educ. 2021 Jul;85(7):1210-1216. doi: 10.1002/jdd.12600. Epub 2021 Mar 31.

Who will pass the dental OSCE? Comparison of the Angoff and the borderline regression standard setting methods.

Eur J Dent Educ. 2009 Aug;13(3):162-71. doi: 10.1111/j.1600-0579.2008.00568.x.

Simulation-based examinations in physician assistant education: A comparison of two standard-setting methods.

J Physician Assist Educ. 2010;21(2):7-14. doi: 10.1097/01367895-201021020-00002.

Comparison of two methods of standard setting: the performance of the three-level Angoff method.

Med Educ. 2011 Dec;45(12):1199-208. doi: 10.1111/j.1365-2923.2011.04073.x.

Reliability and credibility of an Angoff standard setting procedure in progress testing using recent graduates as judges.

Med Educ. 1999 Nov;33(11):832-7. doi: 10.1046/j.1365-2923.1999.00487.x.

Comparison of results between modified-Angoff and bookmark methods for estimating cut score of the Korean medical licensing examination.

Korean J Med Educ. 2018 Dec;30(4):347-357. doi: 10.3946/kjme.2018.110. Epub 2018 Dec 1.

Is an Angoff standard an indication of minimal competence of examinees or of judges?

Adv Health Sci Educ Theory Pract. 2008 May;13(2):203-11. doi: 10.1007/s10459-006-9035-1. Epub 2006 Oct 17.

How to set the bar in competency-based medical education: standard setting after an Objective Structured Clinical Examination (OSCE).

BMC Med Educ. 2016 Jan 4;16:1. doi: 10.1186/s12909-015-0506-z.

Standard setting of objective structured practical examination by modified Angoff method: A pilot study.

Natl Med J India. 2016 May-Jun;29(3):160-162.

Cited by

Situational judgment test in pharmacy education: assessing professionalism capability among students.

BMC Res Notes. 2025 Mar 26;18(1):128. doi: 10.1186/s13104-025-07183-6.

The Utility of Multiple-Choice Assessment in Current Medical Education: A Critical Review.

Cureus. 2024 May 7;16(5):e59778. doi: 10.7759/cureus.59778. eCollection 2024 May.

SAEM systematic online academic resource (SOAR) review: Gastrointestinal illnesses.

AEM Educ Train. 2024 Mar 22;8(2):e10954. doi: 10.1002/aet2.10954. eCollection 2024 Apr.

Systematic online academic resource (SOAR) review: Pediatric respiratory infectious disease.

AEM Educ Train. 2024 Feb 21;8(1):e10945. doi: 10.1002/aet2.10945. eCollection 2024 Feb.

Standard setting anchor statements: a double cross-over trial of two different methods.

MedEdPublish (2016). 2021 Feb 3;10:32. doi: 10.15694/mep.2021.000032.1. eCollection 2021.

Standard Setting for the CGSO Qualifying Examination: A Structured Approach Setting a Meaningful Standard.

Ann Surg Oncol. 2024 Jun;31(6):3569-3571. doi: 10.1245/s10434-024-15036-y. Epub 2024 Mar 13.

The European Examination in Core Cardiology in Focus: Evaluation and Recommendations Using Educational Theory.

J Eur CME. 2022 Mar 24;11(1):2055266. doi: 10.1080/21614083.2022.2055266. eCollection 2022.

Are basic robotic surgical skills transferable from the simulator to the operating room? A randomized, prospective, educational study.

Can Urol Assoc J. 2020 Dec;14(12):416-422. doi: 10.5489/cuaj.6460.

Standard setting made easy: validating the Equal Z-score (EZ) method for setting cut-score for clinical examinations.

BMC Med Educ. 2020 May 25;20(1):167. doi: 10.1186/s12909-020-02080-x.

Systematic Online Academic Resource (SOAR) Review: Renal and Genitourinary.

AEM Educ Train. 2019 May 23;3(4):375-386. doi: 10.1002/aet2.10351. eCollection 2019 Oct.
