Department of Mathematics, William & Mary, Williamsburg, VA 23185, United States.
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, United States.
Biometrics. 2024 Jan 29;80(1). doi: 10.1093/biomtc/ujad024.
Lung cancer is a leading cause of cancer mortality globally, highlighting the importance of understanding its mortality risks to design effective patient-centered therapies. The National Lung Screening Trial (NLST) employed computed tomography texture analysis, which provides objective measurements of texture patterns on CT scans, to quantify the mortality risks of lung cancer patients. Partially linear Cox models have gained popularity for survival analysis by dissecting the hazard function into parametric and nonparametric components, allowing for the effective incorporation of both well-established risk factors (such as age and clinical variables) and emerging risk factors (eg, image features) within a unified framework. However, when the dimension of parametric components exceeds the sample size, the task of model fitting becomes formidable, while nonparametric modeling grapples with the curse of dimensionality. We propose a novel Penalized Deep Partially Linear Cox Model (Penalized DPLC), which incorporates the smoothly clipped absolute deviation (SCAD) penalty to select important texture features and employs a deep neural network to estimate the nonparametric component of the model. We prove the convergence and asymptotic properties of the estimator and compare it to other methods through extensive simulation studies, evaluating its performance in risk prediction and feature selection. The proposed method is applied to the NLST study dataset to uncover the effects of key clinical and imaging risk factors on patients' survival. Our findings provide valuable insights into the relationship between these factors and survival outcomes.
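As an illustration of the kind of objective the abstract describes, the sketch below combines a Cox negative log partial likelihood with a SCAD penalty on the linear coefficients. It is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the standard partially linear Cox form, in which the log-hazard ratio is beta'x + g(z); it uses the textbook SCAD definition with the conventional default a = 3.7; and it assumes no tied event times. All function and argument names (scad_penalty, neg_log_partial_likelihood, penalized_objective, g_z) are hypothetical. The values g_z stand in for the output of the nonparametric component, which the paper estimates with a deep neural network. The scaling of the likelihood term is also an assumption.

```python
import math

def scad_penalty(beta_j, lam, a=3.7):
    """Textbook SCAD penalty for one coefficient (a > 2, lam > 0)."""
    t = abs(beta_j)
    if t <= lam:
        # Lasso-like linear region near zero
        return lam * t
    if t <= a * lam:
        # Quadratic transition region
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients are not penalized further
    return lam * lam * (a + 1) / 2

def neg_log_partial_likelihood(eta, time, event):
    """Cox negative log partial likelihood, assuming no tied event times."""
    # Visit subjects from longest to shortest follow-up so that the running
    # sum of exp(eta) covers exactly the risk set of the current subject.
    order = sorted(range(len(time)), key=lambda i: -time[i])
    risk_sum = 0.0
    nll = 0.0
    for i in order:
        risk_sum += math.exp(eta[i])
        if event[i]:
            nll -= eta[i] - math.log(risk_sum)
    return nll

def penalized_objective(beta, x, g_z, time, event, lam, a=3.7):
    """Partial likelihood loss plus SCAD penalty on the linear part only.

    g_z holds precomputed values of the nonparametric component
    (hypothetical stand-in for a neural network output).
    """
    eta = [sum(b * v for b, v in zip(beta, row)) + g
           for row, g in zip(x, g_z)]
    penalty = sum(scad_penalty(b, lam, a) for b in beta)
    return neg_log_partial_likelihood(eta, time, event) + penalty
```

Because the SCAD penalty is flat beyond a * lam, large coefficients are penalized less than under a lasso penalty, while small coefficients can still be shrunk to zero. This is the property that makes it attractive for selecting features.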