Identification of influential observations in high-dimensional survival data through robust penalized Cox regression based on trimming.

Author information

Department of Health Statistics, School of Public Health and Management, Binzhou Medical University, Yantai City, Shandong 264003, China.

Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan City, Shanxi 030001, China.

Publication information

Math Biosci Eng. 2023 Jan 11;20(3):5352-5378. doi: 10.3934/mbe.2023248.

DOI:10.3934/mbe.2023248

PMID:36896549

Abstract

Penalized Cox regression can be used efficiently to identify biomarkers related to disease prognosis in high-dimensional genomic data. However, the results of penalized Cox regression are influenced by sample heterogeneity: some individuals have a dependence structure between survival time and covariates that differs from that of most individuals. These observations are called influential observations or outliers. A robust penalized Cox model (the reweighted Elastic Net-type maximum trimmed partial likelihood estimator, Rwt MTPL-EN) is proposed to improve prediction accuracy and identify influential observations. A new algorithm, AR-Cstep, is also proposed to solve the Rwt MTPL-EN model. The method was validated in a simulation study and in an application to glioma microarray expression data. When there were no outliers, the results of Rwt MTPL-EN were close to those of the Elastic Net (EN). When outliers were present, the results of EN were affected by them. Whether the censoring rate was high or low, the robust Rwt MTPL-EN performed better than EN and could resist outliers in both the predictors and the response. In terms of outlier detection accuracy, Rwt MTPL-EN was much higher than EN. Outliers who "lived too long" made EN perform worse but were accurately detected by Rwt MTPL-EN. In the analysis of the glioma gene expression data, most of the outliers identified by EN were those who "failed too early", but most of them were not obvious outliers according to the risk estimated from omics data or clinical variables. Most of the outliers identified by Rwt MTPL-EN were those who "lived too long", and most of them were obvious outliers according to the risk estimated from omics data or clinical variables. Rwt MTPL-EN can be adopted to detect influential observations in high-dimensional survival data.
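The trimming idea in the abstract can be illustrated with a minimal sketch. This is not the authors' AR-Cstep algorithm or their Rwt MTPL-EN implementation, and the reweighting step is omitted. It assumes a generic concentration-step loop: fit an elastic-net Cox model on a subset, rank all observations by absolute martingale residual, keep the best-fitting fraction, and repeat. The function names, the proximal-gradient solver, the residual-based ranking, and all tuning values are assumptions made for illustration only.

```python
import numpy as np

def cox_enet_fit(X, time, event, lam=0.05, alpha=0.5, n_iter=300, step=0.1):
    """Elastic-net Cox fit by proximal gradient descent (ties in time are ignored)."""
    n, p = X.shape
    order = np.argsort(-time)              # descending time: risk set of row i is rows 0..i
    Xs, ds = X[order], event[order].astype(float)
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = Xs @ beta
        w = np.exp(eta - eta.max())        # shift for numerical stability; ratios are unchanged
        cum_w = np.cumsum(w)
        cum_wx = np.cumsum(w[:, None] * Xs, axis=0)
        # gradient of the negative mean log partial likelihood, plus the ridge part
        grad = -(ds[:, None] * (Xs - cum_wx / cum_w[:, None])).sum(axis=0) / n
        grad += lam * (1 - alpha) * beta
        z = beta - step * grad
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * alpha, 0.0)  # soft-threshold (lasso part)
    return beta

def martingale_residuals(beta, X, time, event, fit_idx):
    """Martingale residuals for all rows; Breslow baseline hazard estimated on fit_idx only."""
    w = np.exp(X @ beta)
    tf, df, wf = time[fit_idx], event[fit_idx], w[fit_idx]
    ev_times = tf[df == 1]
    incr = np.array([1.0 / wf[tf >= t].sum() for t in ev_times])   # hazard jump at each event time
    H0 = np.array([incr[ev_times <= t].sum() for t in time])       # cumulative baseline hazard
    return event - H0 * w

def trimmed_cox_enet(X, time, event, trim=0.2, n_csteps=10, seed=0, **fit_kw):
    """Concentration steps: refit on the h observations with the smallest |residual|."""
    rng = np.random.default_rng(seed)
    n = len(time)
    h = n - int(trim * n)
    keep = np.sort(rng.choice(n, size=h, replace=False))           # random starting subset
    for _ in range(n_csteps):
        beta = cox_enet_fit(X[keep], time[keep], event[keep], **fit_kw)
        score = np.abs(martingale_residuals(beta, X, time, event, keep))
        new_keep = np.sort(np.argsort(score)[:h])
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    beta = cox_enet_fit(X[keep], time[keep], event[keep], **fit_kw)  # final fit on the kept subset
    flagged = np.setdiff1d(np.arange(n), keep)
    return beta, keep, flagged

# Hypothetical simulated data: two active covariates, random censoring,
# and ten high-risk subjects given very long survival times ("lived too long").
rng = np.random.default_rng(1)
n, p = 120, 6
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0])
t_event = rng.exponential(1.0 / np.exp(X @ beta_true))
t_cens = rng.exponential(3.0, size=n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)
planted = np.argsort(-(X @ beta_true))[:10]
time[planted] = 5.0 * time.max() + rng.random(10)   # jitter avoids tied times
event[planted] = 1

beta_hat, keep, flagged = trimmed_cox_enet(X, time, event, trim=0.2)
```

Comparing `flagged` with `planted` (for example with `np.intersect1d`) shows how many of the planted long survivors the sketch trims. The outcome depends on the trimming fraction, the penalty settings, and the random starting subset, so it should not be read as a reproduction of the paper's results.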

Similar articles

An Efficient Algorithm for the Detection of Outliers in Mislabeled Omics Data.

Comput Math Methods Med. 2021 Dec 22;2021:9436582. doi: 10.1155/2021/9436582. eCollection 2021.

Predicting censored survival data based on the interactions between meta-dimensional omics data in breast cancer.

J Biomed Inform. 2015 Aug;56:220-8. doi: 10.1016/j.jbi.2015.05.019. Epub 2015 Jun 3.

Comparison of methods for the detection of outliers and associated biomarkers in mislabeled omics data.

BMC Bioinformatics. 2020 Aug 14;21(1):357. doi: 10.1186/s12859-020-03653-9.

NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data.

Bioinformatics. 2015 Oct 15;31(20):3330-8. doi: 10.1093/bioinformatics/btv374. Epub 2015 Jun 18.

Identification of clinically relevant features in hypertensive patients using penalized regression: a case study of cardiovascular events.

Med Biol Eng Comput. 2019 Sep;57(9):2011-2026. doi: 10.1007/s11517-019-02007-9. Epub 2019 Jul 25.

Robust estimation of the expected survival probabilities from high-dimensional Cox models with biomarker-by-treatment interactions in randomized clinical trials.

BMC Med Res Methodol. 2017 May 22;17(1):83. doi: 10.1186/s12874-017-0354-0.

Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data.

Bioinformatics. 2005 Jul 1;21(13):3001-8. doi: 10.1093/bioinformatics/bti422. Epub 2005 Apr 6.

A surrogate ℓ0 sparse Cox's regression with applications to sparse high-dimensional massive sample size time-to-event data.

Stat Med. 2020 Mar 15;39(6):675-686. doi: 10.1002/sim.8438. Epub 2019 Dec 8.

An elastic-net penalized expectile regression with applications.

J Appl Stat. 2020 Jun 30;48(12):2205-2230. doi: 10.1080/02664763.2020.1787355. eCollection 2021.
