Case S M, Swanson D B, Stillman P L
National Board of Medical Examiners, Philadelphia, PA 19104.
Res Med Educ. 1988;27:3-8.
The issues involved in measuring pattern recognition are similar to those in measuring other clinical skills: how can testing time be used most efficiently to obtain reliable and valid scores, and how should tests be constructed so that scores validly reflect individual performance in making diagnoses? Generalizability analyses indicated that performance in one topic area does not predict performance in other areas very well. For example, students who are relatively expert in diagnosing patients with headaches tend not to be expert in diagnosing patients with chest pain, joint pain, etc. Therefore, to evaluate diagnostic pattern recognition skills in general, it is preferable to sample more presenting complaints with fewer items directed at each one than to sample more items within a small number of presenting complaints. Approximately one hour of testing time would be required to generate a reasonably reliable score (i.e., one with a generalizability coefficient greater than 0.80). For diagnostic or remedial purposes, performance can be examined by content area to determine specific areas of weakness for individual students. The next phase of this study will be directed at determining whether there are benefits to using the current matching format, with a relatively long list of response alternatives, rather than a traditional multiple-choice format with five choices. It is hypothesized that the shorter list differentially benefits the lower-ability students and the more junior students. Efforts will also be directed at determining the applicability of the item format to other content areas, such as diagnostic testing and therapy.
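The sampling recommendation in the abstract can be illustrated with a small decision-study calculation. The sketch below assumes a design in which items are nested within presenting complaints and both are crossed with students. The abstract does not state the design or report variance components, so the design, the function name `g_coefficient`, and the three variance values are hypothetical and chosen only to show the direction of the effect. With total test length held fixed, spreading the items over more complaints shrinks the student-by-complaint error term and raises the coefficient.

```python
def g_coefficient(var_p, var_pc, var_pic, n_complaints, items_per_complaint):
    """Relative generalizability coefficient for students crossed with
    items nested in presenting complaints (an assumed design)."""
    # Error from students ranking differently across complaints,
    # averaged over the number of complaints sampled.
    complaint_error = var_pc / n_complaints
    # Error from item-level variation within complaints,
    # averaged over the total number of items.
    item_error = var_pic / (n_complaints * items_per_complaint)
    return var_p / (var_p + complaint_error + item_error)


# Hypothetical variance components (NOT taken from the study):
# student, student-by-complaint, and student-by-item-within-complaint.
VAR_P, VAR_PC, VAR_PIC = 0.02, 0.04, 0.16

TOTAL_ITEMS = 60  # fixed test length, also an assumption

for n_c in (5, 10, 20, 30):
    n_i = TOTAL_ITEMS // n_c
    coef = g_coefficient(VAR_P, VAR_PC, VAR_PIC, n_c, n_i)
    print(f"{n_c:2d} complaints x {n_i:2d} items each: coefficient = {coef:.3f}")
```

Under these assumed values, only the designs that sample many complaints approach the 0.80 threshold mentioned in the abstract. Real variance components would have to be estimated from response data.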