Orcutt Xavier, Chen Kan, Mamtani Ronac, Long Qi, Parikh Ravi B
Navajo Indian Health Service, Chinle, AZ, USA.
Department of Biostatistics, Harvard University, Boston, MA, USA.
Nat Med. 2025 Feb;31(2):457-465. doi: 10.1038/s41591-024-03352-5. Epub 2025 Jan 3.
Randomized controlled trials (RCTs) evaluating anti-cancer agents often lack generalizability to real-world oncology patients. Although restrictive eligibility criteria contribute to this issue, the role of selection bias related to prognostic risk remains unclear. In this study, we developed TrialTranslator, a framework designed to systematically evaluate the generalizability of RCTs for oncology therapies. Using a nationwide database of electronic health records from Flatiron Health, this framework emulates RCTs across three prognostic phenotypes identified through machine learning models. We applied this approach to 11 landmark RCTs that investigated anti-cancer regimens considered standard of care for the four most prevalent advanced solid malignancies. Our analyses reveal that patients in low-risk and medium-risk phenotypes exhibit survival times and treatment-associated survival benefits similar to those observed in RCTs. In contrast, high-risk phenotypes show significantly lower survival times and treatment-associated survival benefits compared to RCTs. Our results were corroborated by a comprehensive robustness assessment, including examinations of specific patient subgroups, holdout validation and semi-synthetic data simulation. These findings suggest that the prognostic heterogeneity among real-world oncology patients plays a substantial role in the limited generalizability of RCT results. Machine learning frameworks may facilitate individual patient-level decision support and estimation of real-world treatment benefits to guide trial design.
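The stratified comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three prognostic phenotypes are cut or how survival is estimated, so everything below is an assumption: tertiles of a predicted risk score, an unweighted Kaplan-Meier median, and the hypothetical field names `risk`, `arm`, `months`, and `died`. It is not the TrialTranslator implementation.

```python
from collections import defaultdict


def assign_phenotypes(risk_scores):
    """Label each patient low/medium/high by tertile of predicted risk.

    Tertile cut points are an assumption made for this sketch.
    """
    ranked = sorted(risk_scores)
    n = len(ranked)
    low_cut = ranked[n // 3]
    high_cut = ranked[(2 * n) // 3]
    labels = []
    for score in risk_scores:
        if score < low_cut:
            labels.append("low")
        elif score < high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


def km_median(times, events):
    """Kaplan-Meier median: first time the survival estimate is <= 0.5.

    events[i] is 1 for an observed death and 0 for censoring.
    Returns None when the estimate never reaches 0.5.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            if survival <= 0.5:
                return t
        at_risk -= leaving
    return None


def median_by_phenotype_and_arm(patients):
    """Median survival for every (phenotype, treatment arm) pair."""
    labels = assign_phenotypes([p["risk"] for p in patients])
    groups = defaultdict(lambda: ([], []))
    for patient, label in zip(patients, labels):
        times, events = groups[(label, patient["arm"])]
        times.append(patient["months"])
        events.append(patient["died"])
    return {key: km_median(t, e) for key, (t, e) in groups.items()}
```

Comparing the per-phenotype medians of the two arms against the medians reported by a trial gives a rough view of where real-world outcomes diverge. A fuller emulation would also need to adjust for confounding between arms, which this sketch omits.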