Tweed Mike, Wilkinson Tim
Medical Education Unit, University of Otago, Wellington, New Zealand.
Clin Teach. 2012 Oct;9(5):299-303. doi: 10.1111/j.1743-498X.2012.00567.x.
Clinicians are familiar with making diagnostic decisions based on information gathered from history, clinical examination and diagnostic tests. Although many clinicians assess students, they may be less familiar with ways to assimilate assessment information to inform educational decisions. We draw parallels between the processes used to make a clinical diagnosis and the similar processes needed to make an educational decision.
There are several indices that describe the performance and utility of diagnostic tests, which we have extrapolated to educational assessment.
We provide a clinical diagnostic question and an educational assessment question, and use examples of indices of performance and utility in both situations to explore reliability, indeterminate results, certainty in decisions, acceptable levels of sensitivity and specificity, pre-test probability and dealing with limitations. Test reliability requires adequate sampling and consistency between observers. Efforts to seek more information should be targeted at situations where decisions are not certain. Altering score cut-points alters test sensitivity and specificity, which in assessment alters the numbers of candidates who falsely pass or falsely fail. Just as the pre-test probability of a diagnosis influences how diagnostic tests are interpreted, so too does the pre-test probability of failure alter the performance characteristics of assessments. In clinical situations, a 'wait and see' approach may be limited by clinical urgency. Likewise, in assessment the 'wait and see' approach may be limited by a duty to society.
Clinicians familiar with the performance and utility of diagnostic tests can extrapolate that knowledge to make better interpretations of educational assessments.
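The effect of cut-points and pre-test probability described above can be illustrated with a small worked example. This is a minimal sketch rather than the authors' method: the scores, the "should fail" labels, the cut-points and the function names (sens_spec, post_test_probability) are all hypothetical, and a score below the cut-point is treated as a "positive test" for a candidate who should fail.

```python
# Hypothetical data: eight candidates' scores and whether each one
# truly should fail (an assumed reference standard, invented here).
scores = [42, 48, 51, 55, 58, 61, 64, 70]
truly_failing = [True, True, True, False, True, False, False, False]


def sens_spec(scores, truly_failing, cut):
    """Return (sensitivity, specificity) when 'score < cut' means fail."""
    tp = fn = tn = fp = 0
    for score, should_fail in zip(scores, truly_failing):
        flagged = score < cut
        if should_fail and flagged:
            tp += 1          # correctly failed
        elif should_fail:
            fn += 1          # falsely passing candidate
        elif flagged:
            fp += 1          # falsely failing candidate
        else:
            tn += 1          # correctly passed
    return tp / (tp + fn), tn / (tn + fp)


def post_test_probability(pre_test, sensitivity, specificity):
    """Bayes' theorem: probability that a failed candidate truly should fail."""
    true_pos = sensitivity * pre_test
    false_pos = (1 - specificity) * (1 - pre_test)
    return true_pos / (true_pos + false_pos)


# Raising the cut-point trades falsely passing for falsely failing candidates.
low_cut = sens_spec(scores, truly_failing, 50)
high_cut = sens_spec(scores, truly_failing, 60)

# The same test result means different things in cohorts with a low
# (5%) or high (50%) pre-test probability of failure.
rare = post_test_probability(0.05, *high_cut)
common = post_test_probability(0.50, *high_cut)
```

With these invented numbers, the lower cut-point misses half of the candidates who should fail but fails nobody wrongly, whereas the higher cut-point catches all of them at the cost of one falsely failing candidate. A fail result is also far more convincing in the cohort where failure is common than in the one where it is rare.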