Miller P L
Comput Methods Programs Biomed. 1986 Mar;22(1):5-11.
This paper discusses the underlying issues in evaluating computer systems that apply artificial intelligence in medicine. Three levels of evaluation are described: the subjective evaluation of the research contribution of a developmental prototype, the validation of a system's knowledge and performance, and the evaluation of the clinical efficacy of an operational system. The paper outlines a number of evaluation issues at each level and discusses how previous evaluations of artificial intelligence in medicine systems fit into this framework.