
Deep reinforcement learning for automated radiation adaptation in lung cancer.

Author Affiliations

Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, USA.

Department of Electrical and Computer Engineering, National Chiao Tung University, Hsinchu, Taiwan.

Publication Information

Med Phys. 2017 Dec;44(12):6690-6705. doi: 10.1002/mp.12625. Epub 2017 Nov 14.

Abstract

PURPOSE

To investigate deep reinforcement learning (DRL) based on historical treatment plans for developing automated radiation adaptation protocols for non-small cell lung cancer (NSCLC) patients, with the aim of maximizing tumor local control at reduced rates of grade 2 radiation pneumonitis (RP2).

METHODS

In a retrospective population of 114 NSCLC patients who received radiotherapy, a three-component neural network framework was developed for DRL of dose fractionation adaptation. Large-scale patient characteristics included clinical, genetic, and imaging radiomics features in addition to tumor and lung dosimetric variables. First, a generative adversarial network (GAN) was employed to learn, from a relatively limited sample size, the patient population characteristics necessary for DRL training. Second, a radiotherapy artificial environment (RAE) was reconstructed by a deep neural network (DNN) utilizing both original and GAN-synthesized data to estimate the transition probabilities for adapting each patient's personalized treatment course. Third, a deep Q-network (DQN) was applied to the RAE to choose the optimal dose in a response-adapted treatment setting. This multicomponent reinforcement learning approach was benchmarked against real clinical decisions applied in an adaptive dose escalation clinical protocol, in which 34 patients were treated based on avid PET signal in the tumor, constrained by a 17.2% normal tissue complication probability (NTCP) limit for RP2. The uncomplicated cure probability (P+) was used as the baseline reward function in the DRL.
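The baseline reward and the DQN's dose-selection step can be sketched as follows. This is a minimal illustration, assuming the common definition P+ = TCP × (1 − NTCP) and a discretized 1-5 Gy action grid; the study's exact network architecture, reward form, and dose discretization are not reproduced here.

```python
import numpy as np

# Discretized per-fraction dose actions (Gy), spanning the 1-5 Gy
# range the DQN was allowed to explore (grid spacing is an assumption).
DOSE_ACTIONS = np.arange(1.0, 5.25, 0.25)

def p_plus(tcp, ntcp_rp2):
    """Baseline reward: uncomplicated cure probability.
    Assumes the common form P+ = TCP * (1 - NTCP); the study's exact
    formulation may differ."""
    return tcp * (1.0 - ntcp_rp2)

def select_dose(q_values, epsilon, rng):
    """Epsilon-greedy action selection over the dose grid, as used in
    standard DQN training (illustrative stand-in for the trained network)."""
    if rng.random() < epsilon:
        return rng.choice(DOSE_ACTIONS)
    return DOSE_ACTIONS[int(np.argmax(q_values))]

rng = np.random.default_rng(0)
q = rng.normal(size=DOSE_ACTIONS.size)   # stand-in Q-values for one patient state
dose = select_dose(q, epsilon=0.1, rng=rng)
reward = p_plus(tcp=0.70, ntcp_rp2=0.15)  # hypothetical outcome probabilities
```

In an actual DQN, `q_values` would come from a network evaluated on the patient's state features, and the reward would be computed from outcome models at the end of the adapted course rather than per step.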

RESULTS

Taking our adaptive dose escalation protocol as a blueprint for the proposed DRL (GAN + RAE + DQN) architecture, we obtained an automated dose adaptation estimate for use at approximately two-thirds of the way into the radiotherapy treatment course. When the DQN component was allowed to freely control the estimated adaptive dose per fraction (ranging from 1 to 5 Gy), the DRL automatically favored dose escalation/de-escalation between 1.5 and 3.8 Gy, a range similar to that used in the clinical protocol. Using different reward variants, the same DQN yielded two patterns of dose escalation for the 34 test patients. First, with the baseline P+ reward function, the DQN's individual adaptive fraction doses showed tendencies similar to the clinical data, with an RMSE of 0.76 Gy; however, the adaptations suggested by the DQN were generally lower in magnitude (less aggressive). Second, by adjusting the P+ reward function to place greater emphasis on mitigating local failure, better matching of doses between the DQN and the clinical protocol was achieved, with an RMSE of 0.5 Gy. Moreover, the decisions selected by the DQN appeared to have better concordance with patients' eventual outcomes. In comparison, the traditional temporal difference (TD) algorithm for reinforcement learning yielded an RMSE of 3.3 Gy due to numerical instabilities and insufficient learning.
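The reported agreement metric is a root-mean-square error over adaptive doses. A minimal sketch, using hypothetical dose values for illustration only (the actual per-patient doses are not given in the abstract):

```python
import numpy as np

def rmse(suggested, clinical):
    """Root-mean-square error (Gy) between DQN-suggested and
    clinically delivered adaptive doses."""
    d = np.asarray(suggested, dtype=float) - np.asarray(clinical, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

# Hypothetical per-patient adaptive fraction doses (Gy).
dqn_doses      = [2.0, 2.4, 3.1, 1.8, 2.6]
clinical_doses = [2.2, 3.0, 3.6, 2.4, 3.0]
print(round(rmse(dqn_doses, clinical_doses), 2))
```

A uniformly lower set of DQN doses (the "less aggressive" pattern described above) would show up here as a consistently signed difference vector even when the RMSE itself is modest.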

CONCLUSION

We demonstrated that automated dose adaptation by DRL is a feasible and promising approach for achieving results similar to those chosen by clinicians. The process may require customization of the reward function when individual cases are considered. However, developing this framework into a fully credible autonomous system for clinical decision support would require further validation on larger multi-institutional datasets.


Similar Articles

Approximate Policy-Based Accelerated Deep Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2020 Jun;31(6):1820-1830. doi: 10.1109/TNNLS.2019.2927227. Epub 2019 Aug 6.

Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning.
Front Neurorobot. 2019 Dec 10;13:103. doi: 10.3389/fnbot.2019.00103. eCollection 2019.

Multisource Transfer Double DQN Based on Actor Learning.
IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2227-2238. doi: 10.1109/TNNLS.2018.2806087.

Cited By

Key technologies and challenges in online adaptive radiotherapy for lung cancer.
Chin Med J (Engl). 2025 Jul 5;138(13):1559-1567. doi: 10.1097/CM9.0000000000003299. Epub 2024 Sep 23.

