Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium.
PReCISE, NaDI Institute, Faculty of Computer Science, UNamur and CENTAL, ILC, UCLouvain, Belgium.
Phys Med Biol. 2022 May 27;67(11). doi: 10.1088/1361-6560/ac678a.
Interest in machine learning (ML) has grown tremendously in recent years, partly due to the performance leap brought by new deep learning techniques, convolutional neural networks for images, increased computational power, and the wider availability of large datasets. Most fields of medicine follow this trend, and radiation oncology is notably at the forefront, with a long tradition of using digital images and fully computerized workflows. ML models are driven by data and, in contrast to many statistical or physical models, they can be very large and complex, with countless generic parameters. This inevitably raises two concerns: the tight dependence between models and the datasets that feed them, and the interpretability of the models, which decreases as their complexity grows. Any problem in the data used to train a model will later be reflected in its performance. This, together with the low interpretability of ML models, makes their implementation in the clinical workflow particularly difficult. Tools for risk assessment and quality assurance of ML models must therefore address two main points: interpretability and data-model dependency. After a joint introduction to radiation oncology and ML, this paper reviews the main risks and current solutions when applying the latter to workflows in the former. Risks associated with data and models, as well as their interaction, are detailed. Next, the core concepts of interpretability, explainability, and data-model dependency are formally defined and illustrated with examples. Finally, a broad discussion covers key applications of ML in radiation oncology workflows, as well as vendors' perspectives on the clinical implementation of ML.
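To make the interpretability and data-model-dependency concerns concrete, below is a minimal, hypothetical sketch (not from the paper) of one widely used model-agnostic interpretability probe: permutation feature importance, which measures how much a trained model's test score drops when a single input feature is shuffled. The synthetic features here are only a stand-in for clinical variables such as dosimetric or imaging descriptors.

# Minimal sketch: permutation feature importance as an interpretability probe.
# Assumption: synthetic data stands in for clinical features; this is NOT the
# paper's method, just an illustration of a model-agnostic technique.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic dataset: 8 features, only 3 of which are actually informative.
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A complex, weakly interpretable model of the kind discussed in the abstract.
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature 20 times and record the mean drop in test accuracy:
# a large drop means the model depends strongly on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=20,
                                random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")

Because the importance scores are computed from the test set, the same probe also exposes data-model dependency: if the test data are biased or noisy, the apparent importances shift with them.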