Product Development, Genentech, Inc., South San Francisco, California, USA.
Department of Statistics, Stanford University, Stanford, California, USA.
Stat Med. 2021 Nov 10;40(25):5487-5500. doi: 10.1002/sim.9136. Epub 2021 Jul 24.
High-dimensional data are becoming increasingly common in the medical field as large volumes of patient information are collected and processed by high-throughput screening, electronic health records, and comprehensive genomic testing. Statistical models that study the effects of many predictors on survival typically rely on feature selection or penalization to mitigate overfitting. In some cases survival data are also left-truncated, which can give rise to immortal time bias, yet penalized survival methods that adjust for left truncation are not commonly implemented. To address these challenges, we apply a penalized Cox proportional hazards model for left-truncated and right-censored survival data and assess the implications of left truncation adjustment for bias and interpretation. We use simulation studies and a high-dimensional, real-world clinico-genomic database to highlight the pitfalls of failing to account for left truncation in survival modeling.
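To make the left truncation adjustment concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard risk-set formulation, in which a subject counts as at risk at an event time only if they have already entered the study and have not yet failed or been censored, and it adds a lasso penalty to the negative log partial likelihood. The function names (`risk_set`, `penalized_neg_log_pl`), the argument names, and the penalty scaling are all hypothetical choices made for illustration. Ties are handled naively.

```python
import numpy as np

def risk_set(t, entry, exit_, adjust=True):
    """Boolean mask of subjects considered at risk at time t.

    adjust=True  : delayed-entry risk set, entry < t <= exit
    adjust=False : naive risk set that ignores left truncation, t <= exit
    """
    at_risk = exit_ >= t
    if adjust:
        at_risk = at_risk & (entry < t)
    return at_risk

def penalized_neg_log_pl(beta, X, entry, exit_, event, lam=0.0, adjust=True):
    """Lasso-penalized negative log partial likelihood for a Cox model.

    Illustrative sketch only: hypothetical names, naive tie handling,
    and an unscaled penalty lam * ||beta||_1.
    """
    eta = X @ beta  # linear predictor for each subject
    log_pl = 0.0
    for i in np.flatnonzero(event):  # loop over observed events
        mask = risk_set(exit_[i], entry, exit_, adjust)
        log_pl += eta[i] - np.log(np.sum(np.exp(eta[mask])))
    return -log_pl + lam * np.sum(np.abs(beta))

# Toy data: subject 1 enters the study at time 2, after the first event at time 1.
entry = np.array([0.0, 2.0, 0.0])
exit_ = np.array([1.0, 5.0, 3.0])
event = np.array([1, 1, 1])
X = np.array([[0.5], [-1.0], [0.2]])
beta0 = np.zeros(1)

adjusted = penalized_neg_log_pl(beta0, X, entry, exit_, event, adjust=True)
naive = penalized_neg_log_pl(beta0, X, entry, exit_, event, adjust=False)
```

In the toy data, the naive risk set at time 1 includes subject 1 even though that subject could not have been observed to fail before entering at time 2. Counting such subjects as at risk before entry is the source of immortal time bias described above.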