Peeters Michael J, Cor M Kenneth, Boddu Sai Hs, Nesamony Jerry
University of Toledo College of Pharmacy & Pharmaceutical Sciences, Toledo, OH.
University of Alberta Faculty of Pharmacy & Pharmaceutical Sciences, Edmonton, AB.
Innov Pharm. 2021 Feb 26;12(1). doi: 10.24926/iip.v12i1.2925. eCollection 2021.
Reliability is critical validation evidence on which to base high-stakes decision-making. Often, a single exam in a didactic course is not acceptably reliable on its own. How much reliability is gained when multiple exams are combined?
To improve validation evidence towards high-stakes decision-making, Generalizability Theory (G-Theory) can combine reliabilities from multiple exams into one composite-reliability (computed here with G_String IV software). Further, G-Theory decision-studies can illustrate how course-grade reliability changes with the number of exams and the number of exam-items.
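As a rough illustration of what a decision-study computes, the sketch below projects a relative G-coefficient for different numbers of exam occasions and items per exam. It assumes a persons x (items nested in occasions) design, which the abstract does not state, and the variance components are hypothetical placeholders rather than values from this study. The function names are illustrative only and are unrelated to G_String IV.

```python
def g_coefficient(var_p, var_po, var_pio, n_occasions, n_items):
    """Relative G-coefficient for an assumed p x (i:o) design.

    var_p   : person (true-score) variance
    var_po  : person-by-occasion interaction variance
    var_pio : person-by-item-within-occasion variance (with residual error)
    """
    relative_error = var_po / n_occasions + var_pio / (n_occasions * n_items)
    return var_p / (var_p + relative_error)


def min_items_per_exam(var_p, var_po, var_pio, n_occasions, target, max_items=1000):
    """Smallest number of items per exam reaching the target, or None."""
    for n_items in range(1, max_items + 1):
        if g_coefficient(var_p, var_po, var_pio, n_occasions, n_items) >= target:
            return n_items
    return None


if __name__ == "__main__":
    # Hypothetical variance components, chosen only to illustrate the pattern.
    components = dict(var_p=0.010, var_po=0.004, var_pio=0.200)
    for n_occasions in (1, 2, 3, 4):
        per_exam = min_items_per_exam(n_occasions=n_occasions, target=0.70, **components)
        total = None if per_exam is None else per_exam * n_occasions
        print(n_occasions, "occasions:", per_exam, "items per exam,", total, "items in total")
```

With these made-up components, adding occasions lowers both the items needed per exam and the total items needed, which is the kind of trade-off a decision-study is meant to expose. Real projections require variance components estimated from the actual exam data.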
A total of 101 first-year PharmD students took two midterm-exams and one final-exam in a pharmaceutics course. Individually, Exam1 had 50 multiple-choice questions (MCQs; KR-20=0.69), Exam2 had 43 MCQs (KR-20=0.65), and Exam3 had 67 MCQs (KR-20=0.67). After combining exam occasions using G-Theory, the composite-reliability for overall course-grades was 0.71, better than that of any exam alone. Remarkably, with more exam occasions, fewer items per exam and fewer items over all exams were needed to obtain an acceptable composite-reliability. Acceptable reliability could be achieved with different combinations of the number of MCQs on each exam and the number of exam occasions.
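For readers unfamiliar with the single-exam statistic reported above, KR-20 can be computed from a students-by-items matrix of right/wrong scores. The following is a minimal sketch using the standard KR-20 formula with population variances. The `kr20` function name and the toy response matrix are illustrative assumptions, not data or code from this study.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for dichotomous items.

    responses: one list per student, each a list of 0/1 item scores,
    with every student answering the same k items.
    """
    n_students = len(responses)
    k = len(responses[0])
    # Variance of students' total scores (population form, to match p*q below).
    total_variance = pvariance([sum(row) for row in responses])
    # Sum of item variances p*(1-p), where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_students
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_variance)


if __name__ == "__main__":
    # Toy data: 4 hypothetical students, 3 items.
    toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(round(kr20(toy), 2))
```

KR-20 describes one exam at a time, which is why a composite coefficient is needed to describe the reliability of a course grade built from several exams.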
G-Theory provided reliability evidence, a critical part of validation, towards high-stakes decision-making. Final course-grades appeared quite reliable after combining multiple course exams, though this reliability could and should be improved. Notably, more exam occasions allowed fewer items per exam and fewer items over all the exams. Thus, one added benefit of more exam occasions is that educators need to develop fewer items per exam and fewer items overall.