Chen Guorong, Wang Sijian, Sun Guannan, Pan Huanxue
Department of Finance, Beijing Forestry University, Beijing, China.
Department of Statistics and Biostatistics, Rutgers University, New Brunswick, NJ, USA.
Acta Biotheor. 2019 Sep;67(3):225-251. doi: 10.1007/s10441-019-09349-9. Epub 2019 May 28.
Relating genomic data to survival outcomes poses three main challenges: censoring of the survival outcomes, the high dimensionality of the genomic data, and non-normality of the data. We propose a method that addresses these challenges simultaneously and yields robust estimates for detecting genes significantly related to survival outcomes, based on the Accelerated Failure Time (AFT) model. Specifically, we incorporate a general loss function into the AFT model, adopt model regularization and shrinkage techniques, address parameter tuning and model selection, and develop an algorithm based on a unified Expectation-Maximization approach for easy implementation. Simulation results demonstrate the advantages of the proposed method over existing methods when the data have heavy-tailed errors and correlated covariates. Two real case studies on patients illustrate the application of the proposed method.
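The abstract gives no formulas, so the following is only a generic sketch of a penalized AFT setup in standard textbook notation. All symbols here are assumptions introduced for illustration and are not taken from the paper: $T_i$ is the survival time, $C_i$ the censoring time, $x_i$ the vector of $p$ gene covariates, $\rho$ a general (possibly robust) loss, and $p_\lambda$ a penalty with tuning parameter $\lambda$.

```latex
% AFT model: log survival time is linear in the covariates
\log T_i = x_i^\top \beta + \varepsilon_i, \qquad i = 1, \dots, n

% Observed data under right censoring
Y_i = \min(\log T_i, \log C_i), \qquad \delta_i = \mathbf{1}\{T_i \le C_i\}

% Penalized estimation with a general loss and a shrinkage penalty
\hat{\beta} = \arg\min_{\beta} \; \sum_{i=1}^{n} \rho\!\left(\log T_i - x_i^\top \beta\right)
            + \sum_{j=1}^{p} p_\lambda\!\left(|\beta_j|\right)
```

In this sketch $\log T_i$ is unobserved whenever $\delta_i = 0$, which is why an Expectation-Maximization type algorithm, treating the censored times as missing data, is a natural fit. A heavy-tailed error distribution for $\varepsilon_i$ is the case where a loss $\rho$ other than squared error would be expected to help.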