Chen Yunsheng, Aleman Dionne M, Purdie Thomas G, McIntosh Chris
Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, Ontario M5S 3G8, Canada.
Princess Margaret Cancer Centre, Radiation Medicine Program, Toronto, Ontario M5G 2C1, Canada.
Phys Med Biol. 2022 Jan 17;67(2). doi: 10.1088/1361-6560/ac3e0e.
The complexity of generating radiotherapy treatments demands a rigorous quality assurance (QA) process to ensure patient safety and to avoid clinically significant errors. Machine learning classifiers have been explored to augment the scope and efficiency of the traditional radiotherapy treatment planning QA process. However, one important gap in relying on classifiers for QA of radiotherapy treatment plans is the lack of insight into the reasoning behind a specific classifier prediction. We develop explanation methods to understand the decisions of two automated QA classifiers: (1) a region of interest (ROI) segmentation/labeling classifier, and (2) a treatment plan acceptance classifier. For each classifier, we construct a local interpretable model-agnostic explanation (LIME) framework and a novel adaptation of a team-based Shapley values framework. We test these methods on datasets for two radiotherapy treatment sites (prostate and breast), and demonstrate the importance of evaluating QA classifiers using interpretable machine learning approaches. We additionally develop a notion of explanation consistency to assess classifier performance. Our explanation method allows for easy visualization and human expert assessment of classifier decisions in radiotherapy QA. Notably, we find that our team-based Shapley approach is more consistent than LIME. The ability to explain and validate automated decision-making is critical in medical treatments. This analysis allows us to conclude that both QA classifiers are moderately trustworthy and can be used to confirm expert decisions, though the current QA classifiers should not be viewed as a replacement for the human QA process.
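As a rough illustration of the kind of explanation workflow the abstract describes (not the authors' implementation or data), the sketch below applies the `LimeTabularExplainer` from the `lime` package to a hypothetical plan-acceptance classifier. The feature names, the synthetic data, and the random-forest model are all made-up stand-ins, assumed only for demonstration.

```python
# Hypothetical sketch: explaining a plan-acceptance classifier with LIME.
# Feature names, synthetic data, and the model are illustrative stand-ins,
# not the classifiers or datasets used in the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)

# Fabricated plan-level features (e.g. dose-volume metrics), for illustration only.
feature_names = ["ptv_d95", "rectum_v70", "bladder_v65", "max_dose", "n_beams"]
X = rng.normal(size=(500, len(feature_names)))
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=500) > 0).astype(int)

# Stand-in "plan acceptance" classifier.
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    training_data=X,
    feature_names=feature_names,
    class_names=["reject", "accept"],
    mode="classification",
)

# Explain one plan: which features pushed the prediction toward accept or reject?
exp = explainer.explain_instance(X[0], clf.predict_proba, num_features=5)
for feature, weight in exp.as_list():
    print(f"{feature:>20s}  {weight:+.3f}")
```

Per-feature weights like these are what a human expert would inspect to judge whether the classifier's decision rests on clinically sensible evidence; the paper's explanation-consistency notion and team-based Shapley adaptation go beyond this basic setup.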