German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany.
German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany.
Nat Methods. 2024 Feb;21(2):182-194. doi: 10.1038/s41592-023-02150-0. Epub 2024 Feb 12.
Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.