Long W J, Naimi S, Criscitiello M G
MIT Laboratory for Computer Science, Cambridge 02139, USA.
J Am Med Inform Assoc. 1994 Mar-Apr;1(2):127-41. doi: 10.1136/jamia.1994.95236144.
This study evaluated the accuracy of the detailed diagnostic reasoning of the Heart Failure Program, which incorporates a new mechanism to handle temporal relationships and severity constraints.
Tools were developed to summarize diagnoses and automatically generate evaluation forms. Five expert cardiologists were asked to review the reasoning of the program, with two analyzing each case. Cases were gathered retrospectively for diversity and difficulty, and 26 randomly selected cases were evaluated. The underlying issues were identified and classified.
Both reviewers rated the first diagnosis correct in 25% of the cases and at least one rated it wrong in 10%. Analyzing the detailed reasoning, 137 issues were raised, about 5.3 per case. Of these, 53% were possible concerns raised by one reviewer. Of the 5.3 issues per case, 2.5 were attributable to controversies, misunderstandings, or mistakes; 1 was due to the overly simplistic representation of the summaries; and 1.8 were issues related to the program.
Overall, the program is capable of providing high-quality detailed diagnostic hypotheses for complex cardiovascular cases. The results highlight several issues: 1) the difficulty of effectively summarizing hypotheses, 2) the nature of a physician's causal explanation, and 3) some problems in evaluating detailed diagnostic reasoning. The mistakes the program made imply that some additional refinement is needed but that the reasoning mechanisms developed can support the appropriate reasoning. The appropriate next step is a prospective evaluation addressing the program's usefulness.