Hulsman R L, Mollema E D, Oort F J, Hoos A M, de Haes J C J M
Department of Medical Psychology, J4, Academic Medical Centre, P.O. Box 22660, 1100 DD Amsterdam, The Netherlands.
Patient Educ Couns. 2006 Jan;60(1):24-31. doi: 10.1016/j.pec.2004.11.010. Epub 2004 Dec 30.
Using standardized video cases in a computerized objective structured video examination (OSVE) aims to measure the cognitive scripts underlying overt communication behavior through questions on knowledge, understanding, and performance. In this study, the reliability of the OSVE assessment is analyzed using generalizability theory.
Third-year undergraduate medical students from the Academic Medical Center of the University of Amsterdam answered short-essay questions on three video cases, covering history taking, breaking bad news, and decision making, respectively. Of 200 participants, 116 completed all three video cases. Students were assessed in three shifts, each using a set of parallel case editions. About half of all available exams were scored independently by two raters using a detailed rating manual derived from the other half. The reliability of the assessment, the inter-rater reliability, and the interrelatedness of the three case types and their parallel editions were analyzed by computing generalizability coefficients (G).
The test scores showed a normal distribution. The students performed relatively well on the history taking type of video cases, relatively poorly on decision making, and relatively poorly on the understanding ('knows why/when') type of questions. The reliability of the assessment was acceptable (G = 0.66); it can be improved by including up to seven cases in the OSVE. The inter-rater reliability was very good (G = 0.93). The parallel editions of the video cases appeared to be more alike (G = 0.60) than the three case types (G = 0.47).
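The G coefficients above come from a generalizability analysis, and the projection that reliability improves with up to seven cases is a standard decision-study extrapolation. The sketch below illustrates the general technique with a persons × cases crossed design; the simulated variance components and all numbers are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of a generalizability (G) analysis for a crossed
# persons x cases design, plus a D-study projection for more cases.
# Simulated variance components only -- not the paper's actual data.
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_cases = 116, 3  # sample sizes echo the study's design

# scores = person effect + case difficulty + (person x case, error) residual
person = rng.normal(0.0, 1.0, (n_persons, 1))    # universe-score variance
case = rng.normal(0.0, 0.5, (1, n_cases))        # case difficulty effect
resid = rng.normal(0.0, 1.2, (n_persons, n_cases))
scores = person + case + resid

# Variance components via two-way ANOVA mean squares (no replication,
# so person x case interaction and error are confounded in the residual).
grand = scores.mean()
ss_total = ((scores - grand) ** 2).sum()
ss_person = n_cases * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_case = n_persons * ((scores.mean(axis=0) - grand) ** 2).sum()
ms_person = ss_person / (n_persons - 1)
ms_resid = (ss_total - ss_person - ss_case) / ((n_persons - 1) * (n_cases - 1))

var_person = (ms_person - ms_resid) / n_cases  # estimated person variance
var_resid = ms_resid                           # interaction + error variance

def g_coefficient(n: int) -> float:
    """Relative G coefficient for a score averaged over n cases (D-study)."""
    return var_person / (var_person + var_resid / n)

g3 = g_coefficient(3)  # reliability with the three cases actually used
g7 = g_coefficient(7)  # projected reliability with seven cases
print(f"G (3 cases) = {g3:.2f}, G (7 cases) = {g7:.2f}")
```

The D-study function shows why adding cases raises reliability: averaging over more cases shrinks the residual variance term relative to the stable person variance.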
The additional value of an OSVE is the differential picture it provides of the covert cognitive scripts underlying overt communication behavior in different types of consultations, as indicated by differing levels of knowledge, understanding, and performance. The validation of the OSVE score requires more research.
A computerized OSVE has been successfully applied with third-year undergraduate medical students. The test score meets psychometric criteria, enabling proper discrimination between adequately and poorly performing students. The high inter-rater reliability indicates that a single rater suffices.