A fast divide-and-conquer sparse Cox regression.

Author information

Department of Environmental Health, Harvard T. H. Chan School of Public Health, 401 Park Drive West, Boston, MA, 02215, USA.

Department of Biostatistics, Harvard T. H. Chan School of Public Health, 655 Huntington Avenue, Boston, MA, 02115, USA.

Publication information

Biostatistics. 2021 Apr 10;22(2):381-401. doi: 10.1093/biostatistics/kxz036.

Abstract

We propose a computationally and statistically efficient divide-and-conquer (DAC) algorithm to fit sparse Cox regression to massive datasets where the sample size $n_0$ is exceedingly large and the covariate dimension $p$ is not small but $n_0\gg p$. The proposed algorithm achieves computational efficiency through a one-step linear approximation followed by a least squares approximation to the partial likelihood (PL). This sequence of linearizations enables us to maximize the PL using only a small subset of the data and to perform penalized estimation via a fast approximation to the PL. The algorithm is applicable to the analysis of both time-independent and time-dependent survival data. Simulations suggest that the proposed DAC algorithm is substantially faster than both the full sample-based estimators and the existing DAC algorithm, while achieving statistical efficiency similar to that of the full sample-based estimators. The proposed algorithm was applied to extraordinarily large survival datasets for the prediction of heart failure-specific readmission within 30 days among Medicare heart failure patients.
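The three stages named in the abstract (maximize the PL on a small subset, apply a one-step linear update, then penalize a least squares approximation to the PL) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`cox_score_info`, `newton_cox`, `lsa_adaptive_lasso`, `dac_sparse_cox`) are hypothetical, and so are the choice of an adaptive LASSO penalty, the default tuning value `log(n) / n`, the use of the first random block for the initial fit, and the assumption of time-independent covariates with no tied event times. None of these come from the abstract.

```python
import numpy as np

def cox_score_info(beta, X, time, event):
    """Score and observed information of the Cox log partial likelihood.

    Assumes time-independent covariates and no tied event times.
    """
    order = np.argsort(-time)                  # latest times first
    Xs, ds = X[order], event[order].astype(bool)
    w = np.exp(Xs @ beta)
    # Running sums over the sorted data are sums over each risk set.
    s0 = np.cumsum(w)
    s1 = np.cumsum(w[:, None] * Xs, axis=0)
    s2 = np.cumsum(w[:, None, None] * Xs[:, :, None] * Xs[:, None, :], axis=0)
    xbar = s1 / s0[:, None]                    # risk-set weighted covariate means
    score = (Xs[ds] - xbar[ds]).sum(axis=0)
    info = (s2[ds] / s0[ds][:, None, None]).sum(axis=0) - xbar[ds].T @ xbar[ds]
    return score, info

def newton_cox(X, time, event, n_iter=25, tol=1e-8):
    """Plain Newton-Raphson maximizer of the partial likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        score, info = cox_score_info(beta, X, time, event)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def lsa_adaptive_lasso(beta_tilde, A, lam, n_sweeps=200):
    """Coordinate descent for
    0.5 * (b - beta_tilde)' A (b - beta_tilde) + lam * sum_j |b_j| / |beta_tilde_j|.
    The adaptive LASSO penalty is an assumption of this sketch.
    """
    weights = 1.0 / np.abs(beta_tilde)
    beta = beta_tilde.copy()
    for _ in range(n_sweeps):
        for j in range(len(beta)):
            diff = beta - beta_tilde
            r = A[j] @ diff - A[j, j] * diff[j]     # contribution of the other coordinates
            z = A[j, j] * beta_tilde[j] - r
            beta[j] = np.sign(z) * max(abs(z) - lam * weights[j], 0.0) / A[j, j]
    return beta

def dac_sparse_cox(X, time, event, n_blocks=10, lam=None, seed=0):
    """Divide-and-conquer sketch: subset fit, one-step update, penalized LSA."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(n), n_blocks)
    # Stage 1: maximize the PL on a single block only.
    first = blocks[0]
    beta_init = newton_cox(X[first], time[first], event[first])
    # Stage 2: one Newton step using scores and information summed over blocks.
    score, info = np.zeros(p), np.zeros((p, p))
    for idx in blocks:
        s, a = cox_score_info(beta_init, X[idx], time[idx], event[idx])
        score += s
        info += a
    beta_tilde = beta_init + np.linalg.solve(info, score)
    # Stage 3: penalize the quadratic (least squares) approximation to the PL.
    # The information from the one-step pass is reused here for simplicity.
    lam = np.log(n) / n if lam is None else lam     # assumed default, not from the paper
    beta_hat = lsa_adaptive_lasso(beta_tilde, info / n, lam)
    return beta_hat, beta_tilde
```

In this sketch the iterative Newton fit touches only one block, each remaining block is visited once to accumulate a $p$-vector and a $p \times p$ matrix, and the penalized step works on those summaries alone, so its cost does not grow with $n_0$. That is one plausible reading of where the speed-up described in the abstract comes from.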


