Smith A E, Nugent C D, McClean S I
Medical Informatics, Faculty of Informatics, University of Ulster, Jordanstown, Newtownabbey, BT37 0QB, Northern Ireland, Antrim, UK.
Artif Intell Med. 2003 Jan;27(1):1-27. doi: 10.1016/s0933-3657(02)00088-x.
Researchers who design intelligent systems for medical decision support are aware of the need to respond to real clinical issues, and in particular to address the specific ethical problems that the use of black boxes raises in the medical domain. Such intelligent systems therefore have to be thoroughly evaluated for acceptability. Attempts at compliance, however, are hampered by a lack of guidelines. This paper addresses the issue of inherent performance evaluation. Researchers have addressed this issue in part, but a Medline search, using neural networks as an example of intelligent systems, indicated that only about 12.5% of studies evaluated inherent performance adequately. The paper concentrates on possible evaluation methodology, giving a framework and specific suggestions for each type of classification problem. This should allow developers of intelligent systems to produce evidence that output performance has been sufficiently evaluated.