German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany.
German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany.
Nat Methods. 2024 Feb;21(2):195-212. doi: 10.1038/s41592-023-02151-z. Epub 2024 Feb 12.
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint: a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.