Wolfson Centre for Mathematical Biology, Mathematical Institute, Oxford, United Kingdom.
Integrated Mathematical Oncology, Moffitt Cancer Center, Tampa, Florida.
Cancer Res. 2024 Jun 4;84(11):1929-1941. doi: 10.1158/0008-5472.CAN-23-2040.
Standard-of-care treatment regimens have long been designed for maximal cell killing, yet these strategies often fail when applied to metastatic cancers due to the emergence of drug resistance. Adaptive treatment strategies have been developed as an alternative approach, dynamically adjusting treatment to suppress the growth of treatment-resistant populations and thereby delay, or even prevent, tumor progression. Promising clinical results in prostate cancer indicate the potential to optimize adaptive treatment protocols. Here, we applied deep reinforcement learning (DRL) to guide adaptive drug scheduling and demonstrated that these treatment schedules can outperform the current adaptive protocols in a mathematical model calibrated to prostate cancer dynamics, more than doubling the time to progression. The DRL strategies were robust to patient variability, including both tumor dynamics and clinical monitoring schedules. The DRL framework could produce interpretable, adaptive strategies based on a single tumor burden threshold, replicating and informing optimal treatment strategies. The DRL framework had no knowledge of the underlying mathematical tumor model, demonstrating the capability of DRL to help develop treatment strategies in novel or complex settings. Finally, a proposed five-step pathway, which combined mechanistic modeling with the DRL framework and integrated conventional tools to improve interpretability compared with traditional "black-box" DRL models, could allow translation of this approach to the clinic. Overall, the proposed framework generated personalized treatment schedules that consistently outperformed clinical standard-of-care protocols.
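To make the dynamics concrete, the following is a minimal sketch (our illustration, not the authors' code) of the kind of virtual patient model and single-threshold adaptive rule the abstract describes: two competing tumor subpopulations (drug-sensitive S, drug-resistant R) under an on/off drug signal. All parameter values, the progression criterion, and the threshold rule below are illustrative assumptions, not the paper's calibrated values.

```python
def step(S, R, on, dt=1.0, rS=0.035, rR=0.027, K=1.0, d=0.08):
    """One Euler step: logistic growth with a shared carrying capacity K;
    treatment (on=True) adds a kill rate d on sensitive cells only."""
    N = S + R
    dS = rS * S * (1 - N / K) - (d * S if on else 0.0)
    dR = rR * R * (1 - N / K)
    return S + dt * dS, R + dt * dR

def time_to_progression(policy, S0=0.70, R0=0.01, horizon=3000):
    """Days until total burden exceeds 1.2x its baseline (progression)."""
    S, R = S0, R0
    baseline = S0 + R0
    for day in range(horizon):
        S, R = step(S, R, policy(S + R, baseline))
        if S + R > 1.2 * baseline:
            return day
    return horizon

# Continuous maximum-dose analogue vs. a single-threshold adaptive rule
# (dose only while burden exceeds its baseline value, so that retained
# sensitive cells competitively suppress the resistant population).
continuous = lambda burden, baseline: True
threshold  = lambda burden, baseline: burden > baseline

print("continuous therapy:", time_to_progression(continuous), "days")
print("threshold adaptive:", time_to_progression(threshold), "days")
```

Under these toy parameters the threshold rule more than doubles the simulated time to progression relative to continuous dosing, qualitatively mirroring the gain the abstract reports for the DRL-derived schedules.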
Generation of interpretable and personalized adaptive treatment schedules using a deep reinforcement learning framework that interacts with a virtual patient model overcomes the limitations of standardized strategies caused by heterogeneous treatment responses.
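One way the agent-model interaction could be wired up, assuming the gymnasium package, is the following Gymnasium-style environment sketch. The observation (tumor burden at each clinic visit), the per-visit reward, and the visit interval are our assumptions for illustration; the abstract does not specify the paper's implementation.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class VirtualPatientEnv(gym.Env):
    """Sensitive/resistant tumor under on/off therapy, observed at clinic
    visits. visit_days parameterizes the monitoring schedule, to which the
    abstract reports the DRL strategies were robust."""

    def __init__(self, visit_days=28):
        self.visit_days = visit_days
        self.action_space = spaces.Discrete(2)                # 0 = pause, 1 = dose
        self.observation_space = spaces.Box(0.0, 2.0, (1,), np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.S, self.R = 0.70, 0.01
        self.baseline = self.S + self.R
        self.day = 0
        return np.array([self.S + self.R], np.float32), {}

    def step(self, action):
        # Integrate the tumor model between visits (Euler, dt = 1 day).
        for _ in range(self.visit_days):
            N = self.S + self.R
            dS = 0.035 * self.S * (1 - N) - (0.08 * self.S if action else 0.0)
            dR = 0.027 * self.R * (1 - N)
            self.S += dS
            self.R += dR
        self.day += self.visit_days
        burden = self.S + self.R
        progressed = burden > 1.2 * self.baseline
        # +1 per progression-free visit, so maximizing return means
        # maximizing time to progression (our assumed reward design).
        reward = 0.0 if progressed else 1.0
        truncated = self.day >= 3000
        return np.array([burden], np.float32), reward, progressed, truncated, {}
```

An off-the-shelf DRL algorithm (e.g., a DQN implementation) could then be trained against this environment; notably, the agent sees only burden observations, consistent with the abstract's point that the DRL framework has no knowledge of the underlying tumor model.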