Maisiak R S, Berner E S
University of Alabama at Birmingham, Birmingham, AL, USA.
Proc AMIA Symp. 2000:532-6.
Little has been done to examine the relative merit of the measures used to assess the impact of diagnostic decision support systems (DDSS) on physician performance. In this study, 10 different single measures of diagnostic performance were compared empirically. The measures were of three types: rank-order, all-or-none, and appropriateness. The responsiveness (RESP) of each measure, defined as the degree to which the measure can detect differences between conditions of low and high performance, was estimated under two repeated-measures experimental conditions. The diagnostic performance of 108 physicians was compared on medical cases of varying diagnostic difficulty, and with or without a high level of assistance from a DDSS. The results showed that RESP varied nearly tenfold among the measures. The rank-order measures tended to provide the highest RESP values (maximum = 2.14), while the appropriateness measures provided the lowest (maximum = 1.41). The most responsive measures were rank-orders of the correct diagnosis within the top 5 to 10 listed diagnoses.
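To make the distinction between rank-order and all-or-none measures concrete, the following is a minimal Python sketch. The abstract does not give the scoring rules or the RESP formula, so everything here is an assumption for illustration: the function names are hypothetical, the rank-order score is one plausible scheme (k points for first place, down to 1 for k-th place), and responsiveness is computed as a standardized response mean (mean paired difference divided by the standard deviation of the paired differences), which may differ from the statistic used in the study.

```python
from statistics import mean, stdev


def top_k_rank_score(diagnoses, correct, k=10):
    """Assumed rank-order score: k points if the correct diagnosis is
    listed first, k-1 if second, ..., 1 if k-th, 0 if not in the top k."""
    top = diagnoses[:k]
    if correct not in top:
        return 0
    return k - top.index(correct)


def all_or_none_score(diagnoses, correct, k=10):
    """Assumed all-or-none score: 1 if the correct diagnosis appears
    anywhere in the top k, otherwise 0."""
    return 1 if correct in diagnoses[:k] else 0


def responsiveness(low_scores, high_scores):
    """Assumed responsiveness index (standardized response mean) for
    paired scores from the same physicians under two conditions."""
    diffs = [high - low for low, high in zip(low_scores, high_scores)]
    spread = stdev(diffs)  # needs at least two pairs
    if spread == 0:
        raise ValueError("paired differences have zero variance")
    return mean(diffs) / spread


if __name__ == "__main__":
    # Hypothetical differential diagnosis list, not data from the study.
    listed = ["pneumonia", "pulmonary embolism", "bronchitis"]
    print(top_k_rank_score(listed, "pulmonary embolism", k=5))
    print(all_or_none_score(listed, "pulmonary embolism", k=5))
    print(responsiveness([1, 2, 3], [2, 4, 4]))
```

Under these assumptions, a rank-order score keeps information about where the correct diagnosis falls in the list, whereas an all-or-none score collapses it to a single bit, which is one possible reason a rank-order measure could separate low- and high-performance conditions more sharply.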