Survival stacking with multiple data types using pseudo-observation-based-AUC loss.

Affiliation

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden.

Publication information

J Biopharm Stat. 2022 Nov 2;32(6):858-870. doi: 10.1080/10543406.2022.2041655. Epub 2022 May 15.

Abstract

Many strategies have been proposed for adapting machine learning algorithms to right-censored observations in survival data, with the aim of building more accurate risk prediction models. These adaptations include pre-processing steps such as pseudo-observation transformation of the survival outcome, or inverse probability of censoring weighted (IPCW) bootstrapping of the observed binary indicator of an event prior to a time point of interest. Such pre-processing steps allow existing or newly developed machine learning methods, which were not specifically designed with time-to-event data in mind, to be applied to right-censored survival data for predicting the risk of experiencing an event. Stacking or ensemble methods can improve risk predictions, but in general the combination of pseudo-observation-based algorithms, IPCW bootstrapping, IPC weighting of the methods directly, and methods developed specifically for survival data has not been considered in the same ensemble. In this paper, we propose an ensemble procedure based on the area under the pseudo-observation-based time-dependent ROC curve to optimally stack predictions from any survival or survival-adapted algorithm. Results from a real-data application show that our proposed method can improve on single survival-based methods such as survival random forest, and on other strategies that rely on a pre-processing step alone, such as inverse probability of censoring weighted bagging or pseudo-observations.
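
To make the pseudo-observation transformation mentioned in the abstract concrete, here is a minimal sketch of the standard jackknife construction, assuming the target is the cumulative incidence F(t) = 1 - S(t) at a single time point and that S(t) is estimated with the Kaplan-Meier estimator. The function names `km_survival` and `pseudo_observations` are hypothetical and not taken from the paper; the abstract does not say which estimator or software the authors used.

```python
import numpy as np

def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) = P(T > t)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply the conditional survival factors at each distinct event time <= t.
    for u in np.unique(time[event & (time <= t)]):
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & event)
        surv *= 1.0 - deaths / at_risk
    return surv

def pseudo_observations(time, event, t):
    """Jackknife pseudo-observations for F(t) = 1 - S(t) (illustrative only)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = 1.0 - km_survival(time, event, t)
    pseudo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # leave subject i out
        loo = 1.0 - km_survival(time[keep], event[keep], t)
        pseudo[i] = n * full - (n - 1) * loo
    return pseudo

# Toy data (made up for illustration): event = 0 marks a right-censored subject.
time = [2.0, 3.5, 4.0, 5.0, 6.5, 7.0]
event = [1, 0, 1, 1, 0, 1]
po = pseudo_observations(time, event, t=5.0)
```

Each subject, censored or not, receives a numeric pseudo-observation that can stand in for the unobserved binary indicator I(T ≤ t), which is what lets a generic regression or classification learner be trained on censored data. Without any censoring, the pseudo-observations reduce exactly to that indicator.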
