Alshankati K, Alshibany A, Toma A, Lajkosz K, Haibe-Kains B, Siu L L
Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, University Health Network, Toronto, Canada.
Medical Biophysics, University of Toronto, Toronto, Canada; Vector Institute for Artificial Intelligence, Toronto, Canada.
ESMO Open. 2025 Jul 14;10(8):105509. doi: 10.1016/j.esmoop.2025.105509.
BACKGROUND: Depth of tumor response (DepOR) of individual patients, as visualized by waterfall plots, is a short-term endpoint that may serve as a surrogate for survival-based outcomes such as progression-free survival (PFS) and overall survival (OS). We hypothesized that PFS/OS outcomes could be predicted from waterfall plots in randomized clinical trials (RCTs) using a novel machine-learning (ML) computational model.
MATERIALS AND METHODS: A literature-based search was carried out for phase II/III RCTs testing noncytotoxic systemic therapy that included waterfall plots with corresponding PFS/OS results. Studies were defined as positive or negative based on achievement of an a priori-stated primary endpoint. Trial data and images of waterfall plots were manually extracted and then processed through a semi-automatic extraction process. We developed the MAP-OUTCOMES (MAchine learning model to Predict PFS and OS OUTCOMES) model using regularized logistic regression. The model was trained on 70% of the data, and the remaining 30% was held out as a test set.
RESULTS: A total of 91 unique RCTs were identified, of which 82 (93 trial pairs) were retained for the ML analysis. Most of the trials were phase III (75%), with 67% using PFS as the primary endpoint and a mean sample size of 350 patients per arm. The most common tumor type was genitourinary (22%), and small-molecule targeted agents (27%) were the most frequent regimen. In the 28 trials used for the test set, the model achieved an accuracy of 71% [95% confidence interval (CI) 53.6%-86.2%, P = 0.18], an area under the curve (AUC) of 65% (95% CI 33.3%-93.8%, P = 0.157), and an area under the precision-recall curve (AUPRC) of 90% (95% CI 77.9%-99.5%, P = 0.171).
CONCLUSIONS: The MAP-OUTCOMES model demonstrated the feasibility of using ML to predict survival-based outcomes from waterfall plots, thus providing a potential tool for early trial evaluation. Improving the model's performance with more training data and creating independent datasets are necessary steps to assess its generalizability for prospective clinical applications.
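The modeling setup described in the abstract (regularized logistic regression, a 70/30 train/test split, accuracy and AUC on the held-out set) can be illustrated with a minimal sketch. This is not the MAP-OUTCOMES implementation: the abstract does not state the penalty type, the input features, or the software used, so the L2 penalty, the two synthetic features, and every function name below are assumptions made purely for illustration. The data are randomly generated, so the printed metrics have no relation to the reported results.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_l2(X, y, lam=0.1, lr=0.5, epochs=300):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The L2 penalty is an assumption; the abstract only says "regularized".
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The penalty shrinks the weights; the intercept is not penalized.
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def accuracy(y_true, probs, threshold=0.5):
    hits = sum((p >= threshold) == (t == 1) for t, p in zip(y_true, probs))
    return hits / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def run_demo(seed=0, n_pairs=93, test_fraction=0.3):
    rng = random.Random(seed)
    # Synthetic stand-in data: label 1 = "positive" trial, 0 = "negative".
    # The two features are hypothetical placeholders, not the study's inputs.
    y = [i % 2 for i in range(n_pairs)]
    X = [[rng.gauss(1.0 if t else -1.0, 1.0), rng.gauss(0.0, 1.0)] for t in y]
    idx = list(range(n_pairs))
    rng.shuffle(idx)
    n_test = round(test_fraction * n_pairs)  # 30% of 93 rounds to 28
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    w, b = fit_logistic_l2([X[i] for i in train_idx],
                           [y[i] for i in train_idx])
    y_test = [y[i] for i in test_idx]
    probs = predict_proba([X[i] for i in test_idx], w, b)
    return {"n_train": len(train_idx), "n_test": n_test,
            "accuracy": accuracy(y_test, probs),
            "auc": roc_auc(y_test, probs)}


print(run_demo())
```

With 93 rows and a 30% hold-out, the sketch yields 28 test rows and 65 training rows. That matches the test set size reported in the abstract only if the split was made over the 93 trial pairs, which is an inference rather than something the abstract states. With a test set this small, each misclassified trial shifts accuracy by about 3.6 percentage points, which is consistent with the wide confidence intervals reported.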