
Rank-Based Greedy Model Averaging for High-Dimensional Survival Data.

Authors

He Baihua, Ma Shuangge, Zhang Xinyu, Zhu Li-Xing

Affiliations

International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, China.

Department of Biostatistics, Yale University, New Haven, CT.

Publication information

J Am Stat Assoc. 2023;118(544):2658-2670. doi: 10.1080/01621459.2022.2070070. Epub 2022 Jul 7.

DOI: 10.1080/01621459.2022.2070070
PMID: 39552724
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11566305/
Abstract

Model averaging is an effective way to enhance prediction accuracy. However, most previous works focus on low-dimensional settings with completely observed responses. To attain an accurate prediction for the risk effect of survival data with high-dimensional predictors, we propose a novel method: rank-based greedy (RG) model averaging. Specifically, adopting the transformation model with splitting predictors as working models, we use the smooth concordance index function twice, to derive both the candidate predictions and the optimal model weights. The final prediction is obtained as a weighted average of all the candidates. Our approach is flexible, computationally efficient, and robust against model misspecification, as it neither requires the correctness of a joint model nor involves the estimation of the transformation function. We further adopt the greedy algorithm for high dimensions. Theoretically, we derive an asymptotic error bound for the optimal weights under some mild conditions. In addition, the summation of weights assigned to the correct candidate submodels is proven to approach one in probability when there are correct models included among the candidate submodels. Extensive numerical studies are carried out using both simulated and real datasets to show the proposed approach's robust performance compared to existing regularization approaches. Supplementary materials for this article are available online.
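
The weight-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sigmoid smoothing, the bandwidth `h`, the 1/k step size, and the function names `smooth_cindex` and `greedy_weights` are all assumptions made for illustration, and the candidate scores are taken as given rather than fitted from submodels. The sketch also assumes that a higher score corresponds to a longer survival time.

```python
import numpy as np


def smooth_cindex(score, time, event, h=0.1):
    """Sigmoid-smoothed concordance between a score and observed times.

    Assumption: a higher score should go with a longer survival time.
    """
    score = np.asarray(score, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # Pair (i, j) is comparable when subject j had an observed event
    # and subject i was still under observation after that time.
    comparable = (time[:, None] > time[None, :]) & event[None, :]
    n_pairs = comparable.sum()
    if n_pairs == 0:
        raise ValueError("no comparable pairs")
    diff = (score[:, None] - score[None, :]) / h
    smooth = 0.5 * (1.0 + np.tanh(0.5 * diff))  # stable form of the sigmoid
    return float((smooth * comparable).sum() / n_pairs)


def greedy_weights(candidates, time, event, n_steps=20, h=0.1):
    """Greedy search for simplex weights over candidate score columns.

    Hypothetical scheme: at step k, move a fraction 1/k of the weight
    toward the single candidate that most improves the smoothed C-index.
    """
    candidates = np.asarray(candidates, dtype=float)
    n_models = candidates.shape[1]
    weights = np.zeros(n_models)
    for k in range(1, n_steps + 1):
        step = 1.0 / k  # step 1 puts all weight on the best single candidate
        best_m, best_val = 0, -np.inf
        for m in range(n_models):
            trial = (1.0 - step) * weights
            trial[m] += step
            val = smooth_cindex(candidates @ trial, time, event, h)
            if val > best_val:
                best_m, best_val = m, val
        weights *= 1.0 - step
        weights[best_m] += step
    return weights


# Synthetic right-censored data; each covariate serves as one candidate score.
rng = np.random.default_rng(0)
n = 60
x = rng.normal(size=(n, 3))
true_time = np.exp(x[:, 0] + 0.5 * x[:, 1] + rng.normal(scale=0.5, size=n))
censor_time = rng.exponential(scale=3.0, size=n)
observed = np.minimum(true_time, censor_time)
event = true_time <= censor_time

weights = greedy_weights(x, observed, event)
print("weights:", np.round(weights, 3))
print("smoothed C-index:", round(smooth_cindex(x @ weights, observed, event), 3))
```

Because the smoothed index depends on the scale of the scores, the candidates in such a sketch should be put on a comparable scale before averaging. The 1/k step keeps the weights non-negative and summing to one without a separate projection step.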
