Rossi-Mori A, Pisanelli D M, Ricci F L
Istituto Tecnologie Biomediche CNR, Rome, Italy.
Med Inform (Lond). 1990 Jul-Sep;15(3):191-204. doi: 10.3109/14639239009025267.
After the early experiments in artificial intelligence, a methodology is emerging around advanced systems for the management of medical knowledge. The emphasis is moving away from the implementation of prototypes and towards their evaluation. Field evaluation techniques already developed in similar contexts of knowledge management (books, drugs, epidemiology, consultants, etc.) can be adapted and applied to this field. The time is ripe for a further step: to envisage a methodology for the design of real systems that cope with the 'knowledge environment' of the user. Every stage of the evaluation process is re-examined here and considered as a framework for defining the goals and criteria of a corresponding design step: (1) the impact of the system on the progress of health care provision (priorities, cost-benefit analysis, share of tasks among different media); (2) effectiveness in the end-user's environment and long-term effects on the user's behaviour (changes in people's roles and responsibilities, improvements in the quality of data, acceptance of the system); (3) the intrinsic efficiency of the system apart from the operational context (correctness of the knowledge base, appropriateness of the reasoning). The need to differentiate the test sample into three classes (obvious, typical, atypical) is emphasized, and its influence on both evaluation and design is discussed. In particular, the difficulty of establishing 'gold standards' for atypical cases, owing to disagreement among experts, leads to the definition of two alternative attitudes: the 'standardization mode' and the 'brain-storming mode'.