Using information criteria to select smoothing parameters when analyzing survival data with time-varying coefficient hazard models.

Affiliation

School of Public Health, Department of Biostatistics, University of Michigan, Ann Arbor, USA.

Publication information

Stat Methods Med Res. 2023 Sep;32(9):1664-1679. doi: 10.1177/09622802231181471. Epub 2023 Jul 5.

Abstract

Analyzing the large-scale survival data from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) Program may help guide the management of cancer. Detecting and characterizing the time-varying effects of factors collected at the time of diagnosis could reveal important and useful patterns. However, fitting a time-varying effect model by maximizing the partial likelihood with such large-scale survival data is not feasible with most existing software. Moreover, estimating time-varying coefficients using spline-based approaches requires a moderate number of knots, which may lead to unstable estimation and over-fitting. Adding a penalty term greatly aids estimation and helps resolve these issues. Selecting the penalty smoothing parameters is difficult in this time-varying setting: traditional approaches such as the Akaike information criterion do not work, while cross-validation methods carry a heavy computational burden and lead to unstable selections. We propose modified information criteria to determine the smoothing parameter and a parallelized Newton-based algorithm for estimation. We conduct simulations to evaluate the performance of the proposed method. We find that penalization with the smoothing parameter chosen by a modified information criterion is effective at reducing the mean squared error of the estimated time-varying coefficients. Compared to a number of alternatives, variance estimates derived from Bayesian considerations yield confidence intervals with the best coverage rates. We apply the method to SEER head-and-neck, colon, prostate, and pancreatic cancer data and detect the time-varying nature of various risk factors.
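
For orientation, the class of model described in the abstract can be written generically as below. This is a sketch under common conventions for penalized spline Cox models, not the paper's exact formulation. The symbols $\theta_{jk}$, $B_k$, $S_j$, $\mu_j$, $H$, and the effective degrees-of-freedom term are assumptions introduced here for illustration. The abstract does not state how the authors modify the criterion, so only the unmodified AIC-type starting point is shown.

```latex
% Time-varying coefficient hazard model (generic form; notation assumed)
\lambda(t \mid X_i) = \lambda_0(t)\,
  \exp\Big\{ \sum_{j=1}^{p} X_{ij}\,\beta_j(t) \Big\},
\qquad
\beta_j(t) = \sum_{k=1}^{K} \theta_{jk}\, B_k(t)

% Log partial likelihood; D_i is the event indicator, T_i the observed time
\ell(\theta) = \sum_{i=1}^{n} D_i \Big[ X_i^\top \beta(T_i)
  - \log \sum_{l:\, T_l \ge T_i} \exp\{ X_l^\top \beta(T_i) \} \Big]

% Penalized objective with smoothing parameters \mu_j \ge 0
% and penalty matrices S_j (assumed, e.g. difference penalties)
\ell_P(\theta; \mu) = \ell(\theta)
  - \tfrac{1}{2} \sum_{j=1}^{p} \mu_j\, \theta_j^\top S_j\, \theta_j

% Generic AIC-type criterion with effective degrees of freedom, where
% H = -\partial^2 \ell / \partial\theta\,\partial\theta^\top at \hat\theta_\mu
% and S_\mu = \mathrm{blockdiag}(\mu_1 S_1, \dots, \mu_p S_p)
\mathrm{IC}(\mu) = -2\,\ell(\hat\theta_\mu) + 2\,\mathrm{df}(\mu),
\qquad
\mathrm{df}(\mu) = \operatorname{tr}\big\{ (H + S_\mu)^{-1} H \big\}
```

In this generic form, a larger $\mu_j$ shrinks $\mathrm{df}(\mu)$ and yields a smoother $\hat\beta_j(t)$, and the smoothing parameter would be chosen by minimizing $\mathrm{IC}(\mu)$ over a grid of candidate values.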
