School of Data Science, City University of Hong Kong, Hong Kong SAR, China.
Department of Thoracic Oncology, Tongji Hospital, Huazhong University of Science and Technology, 430030, Wuhan, China.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae071.
The evolution of drug resistance leads to treatment failure and tumor progression. Intermittent androgen deprivation therapy (IADT) helps responsive cancer cells compete with resistant cancer cells within the tumor. However, conventional IADT is population-based and ignores the heterogeneity of patients and cancers. In addition, existing IADT relies on pre-determined prostate-specific antigen thresholds to pause and resume treatment, and these thresholds are not optimized for individual patients. To address these challenges, we developed a two-step data-driven method. First, we built a time-varied, mixed-effect and generative Lotka-Volterra (tM-GLV) model that accounts for the heterogeneity of the evolution mechanism and for the pharmacokinetics of two ADT drugs, cyproterone acetate and leuprolide acetate, in individual patients. Second, we proposed a reinforcement-learning-enabled individualized IADT framework, I$^{2}$ADT, that learns patient-specific tumor dynamics and derives the optimal drug administration policy. Experiments with clinical trial data demonstrated that I$^{2}$ADT can significantly prolong the time to progression of prostate cancer patients while reducing the cumulative drug dosage. We further validated the efficacy of the proposed methods with data from a recent pilot clinical trial. Moreover, the adaptability of I$^{2}$ADT makes it a promising tool for other cancers for which clinical data are available and whose treatment regimens may need to be individualized according to patient characteristics and disease dynamics. Our research illustrates how deep reinforcement learning can be applied to identify personalized adaptive cancer therapy.
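To make the threshold-based baseline concrete, the following is a minimal sketch of a two-population competitive Lotka-Volterra simulation under a fixed pause/resume rule. It is not the tM-GLV model or the I$^{2}$ADT policy: the parameter values, the use of total tumor burden as a stand-in for prostate-specific antigen, the 50% pause threshold, and every name (`LVParams`, `lv_step`, `threshold_policy`, `continuous_policy`, `simulate`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class LVParams:
    # All values are illustrative placeholders, not fitted to any patient.
    r_s: float = 0.03   # growth rate of responsive cells (per day)
    r_r: float = 0.02   # growth rate of resistant cells (per day)
    K: float = 1.0      # shared carrying capacity (normalized)
    a_sr: float = 1.0   # competitive effect of resistant on responsive cells
    a_rs: float = 1.5   # competitive effect of responsive on resistant cells
    kill: float = 0.08  # extra death rate of responsive cells while on ADT


def lv_step(s, r, on_treatment, p, dt=1.0):
    """One forward-Euler step of the two-population competition model."""
    ds = p.r_s * s * (1.0 - (s + p.a_sr * r) / p.K)
    dr = p.r_r * r * (1.0 - (r + p.a_rs * s) / p.K)
    if on_treatment:
        ds -= p.kill * s  # assumed: the drug only affects responsive cells
    return max(s + dt * ds, 0.0), max(r + dt * dr, 0.0)


def threshold_policy(burden, baseline, currently_on, pause_frac=0.5):
    """Pause below pause_frac * baseline, resume at baseline, else keep state."""
    if burden <= pause_frac * baseline:
        return False
    if burden >= baseline:
        return True
    return currently_on


def continuous_policy(burden, baseline, currently_on):
    """Always treat; used as a comparison arm."""
    return True


def simulate(policy, p, s0=0.5, r0=0.01, days=2000):
    """Return (treated_days, trajectory of (s, r, on_treatment) per day)."""
    s, r = s0, r0
    baseline = s0 + r0  # total burden at the start of therapy
    on = True
    treated_days = 0
    trajectory = []
    for _ in range(days):
        on = policy(s + r, baseline, on)
        treated_days += int(on)
        trajectory.append((s, r, on))
        s, r = lv_step(s, r, on, p)
    return treated_days, trajectory


params = LVParams()
dose_continuous, _ = simulate(continuous_policy, params)
dose_intermittent, _ = simulate(threshold_policy, params)
print("treated days (continuous):  ", dose_continuous)
print("treated days (intermittent):", dose_intermittent)
```

In this sketch the policy is a fixed rule shared by every simulated patient. An individualized approach of the kind the abstract describes would instead fit the model parameters per patient and replace `threshold_policy` with a learned policy.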