Division of Biostatistics, Institute for Health & Equity, Medical College of Wisconsin, Milwaukee, WI, USA.
Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
Sci Rep. 2024 Jul 1;14(1):15004. doi: 10.1038/s41598-024-64977-7.
The tumor microenvironment (TME) plays a fundamental role in tumorigenesis, tumor progression, and the anti-cancer immune potential of emerging cancer therapeutics. Understanding inter-patient TME heterogeneity, however, remains a challenge to efficient drug development. This article applies recent advances in machine learning (ML) for survival analysis to a retrospective study of non-small cell lung cancer (NSCLC) patients who received definitive surgical resection and immune pathology following surgery. ML methods are compared for their effectiveness in identifying prognostic subtypes. Six survival models, comprising Cox regression and five survival machine learning methods, were calibrated and applied to predict survival for NSCLC patients based on PD-L1 expression, CD3 expression, and ten baseline patient characteristics. Prognostic subregions of the biomarker space are delineated for each method using synthetic patient data augmentation and compared between models for overall survival concordance. A total of 423 NSCLC patients (46% female; median age [interquartile range]: 67 [60-73] years) treated with definitive surgical resection were included in the study. Of these, 219 (52%) patients experienced events during the observation period, which had a maximum follow-up of 10 years and a median follow-up of 78 months. The random survival forest (RSF) achieved the highest predictive accuracy, with a C-index of 0.84. The resultant biomarker subtypes demonstrate that patients with high PD-L1 expression combined with low CD3 counts experience a higher risk of death within five years of surgical resection.
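For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the authors' code: the function name `concordance_index`, the toy data, and the simplified handling of tied event times (tied pairs are skipped) are assumptions made for illustration only.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: the share of comparable patient pairs in which
    the patient with the shorter observed survival has the higher risk score.

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are skipped in this sketch
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the second one censored at month 10.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.8, 0.3, 0.4]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
print(f"C-index on toy data: {toy_c:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a C-index of 0.84 means that in roughly 84% of comparable patient pairs the model assigned the higher risk to the patient who died earlier.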