Department of Radiology Technology, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Department of Mathematics, University of Padova, Padova, Italy.
J Ultrasound Med. 2024 Oct;43(10):1789-1818. doi: 10.1002/jum.16524. Epub 2024 Jul 19.
With the explosion of digital health records in the healthcare industry, artificial intelligence (AI) models can play a more effective role in managing patients. Machine learning (ML) and deep learning (DL) are two approaches used to develop predictive models that improve clinical processes. These models are also built into medical imaging machines, giving them an intelligent decision system that aids physicians in their decisions and makes routine clinical practice more efficient. Physicians who work with these machines need insight into how the implemented models operate in the background. More importantly, they need to be able to interpret the models' predictions, assess their performance, and compare models to find the one with the best performance and the fewest errors. This review aims to provide an accessible overview of key evaluation metrics for physicians without AI expertise. In this review, we developed four real-world diagnostic AI models (two ML and two DL models) for breast cancer diagnosis using ultrasound images. Then, 23 of the most commonly used evaluation metrics were reviewed in plain terms for physicians. Finally, all metrics were calculated and applied in practice to interpret and evaluate the outputs of the models. Accessible explanations and practical applications empower physicians to effectively interpret, evaluate, and optimize AI models to ensure safety and efficacy when integrated into clinical practice.