Yang Dongrong, Wu Xin, Li Xinyi, Xie Yibo, Wu Qiuwen, Wu Q Jackie, Sheng Yang
Department of Radiation Oncology, Duke University Medical Center, Durham, NC 27710, United States of America.
Phys Med Biol. 2025 Apr 22;70(8). doi: 10.1088/1361-6560/adcb84.
Head-and-neck simultaneous integrated boost (SIB) treatment planning using intensity modulated radiation therapy is particularly challenging because the targets lie close to organs-at-risk. Depending on the specific clinical conditions, different parotid-sparing strategies are used to preserve parotid function without compromising local tumor control. Clinically, this is typically done at the attending physician's direction or through trial-and-error comparison of different sparing tradeoffs. To streamline this process, we proposed a deep reinforcement learning (DRL)-based framework that automatically generates treatment plans with the flexibility to adapt to clinical preferences.

A preference-encoded DRL (PEDRL) framework was developed to self-interact with the clinical treatment planning system and dynamically adjust objective constraints in the inverse optimization space. It was powered by the discrete soft actor-critic algorithm with a multi-layer perceptron architecture. The agent interprets the intermediate plan status and iteratively modifies objective constraint values in a human-like fashion. By encoding parotid-sparing preferences within the state space, the agent autonomously adapts the sparing strategy to achieve optimal plan quality based on clinical priorities. The agent was trained through iterative treatment plan generation using 40 cases and subsequently tested on an additional 44 patients, with the generated plans compared to the clinical plans.

The PEDRL-generated plans demonstrated comparable performance across all dosimetric evaluation metrics for both bilateral and unilateral sparing cases in the test set. For bilateral cases, the mean parotid median dose was 18.82 Gy (left) and 19.61 Gy (right), compared to 19.31 Gy (left) and 19.12 Gy (right) in the clinical plans. In unilateral sparing cases, the mean median dose to the spared parotid was 19.92 Gy in the PEDRL-generated plans, compared to 17.16 Gy in the clinical plans.

The proposed automated treatment planning framework efficiently generates SIB treatment plans tailored to clinical preferences, demonstrating both effectiveness and adaptability.
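The preference-encoding idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the preference labels, the one-hot encoding, the action set, the scaling factors, and all function names are assumptions made for illustration, and the uniform logits stand in for the output of a multi-layer perceptron policy. The sketch only shows how a sparing preference could be appended to the plan state and how a discrete stochastic policy could pick one objective constraint to adjust.

```python
import math
import random

# Hypothetical sparing preferences, one-hot encoded and appended to the state.
PREFERENCES = ("bilateral", "spare_left", "spare_right")

# Hypothetical discrete action set: (objective index, multiplicative factor),
# plus a final no-op. The abstract does not list the actual actions.
ACTIONS = [(i, f) for i in range(3) for f in (0.9, 1.1)] + [(None, 1.0)]


def encode_state(plan_metrics, preference):
    """Concatenate intermediate plan metrics with a one-hot preference code."""
    one_hot = [1.0 if p == preference else 0.0 for p in PREFERENCES]
    return list(plan_metrics) + one_hot


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng):
    """Sample an action index from the categorical policy defined by logits."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def apply_action(constraints, action_index):
    """Return a new constraint list with one value scaled (unchanged for no-op)."""
    target, factor = ACTIONS[action_index]
    updated = list(constraints)
    if target is not None:
        updated[target] *= factor
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder metric and constraint values, not data from the study.
    state = encode_state([20.0, 25.0, 0.95], "spare_left")
    logits = [0.0] * len(ACTIONS)  # stand-in for the MLP policy output
    action = sample_action(logits, rng)
    print(state, action, apply_action([20.0, 20.0, 70.0], action))
```

Because the preference is part of the state, the same policy network can in principle produce different constraint adjustments for bilateral and unilateral sparing without retraining; in an actual system the updated constraints would be passed to the treatment planning system for re-optimization before the next state is read.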