School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 47907, USA.
Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, USA.
BMC Med Genomics. 2020 Apr 3;13(Suppl 5):41. doi: 10.1186/s12920-020-0686-1.
Recent advances in kernel-based Deep Learning models have introduced a new era in medical research. Originally designed for pattern recognition and image processing, Deep Learning models are now applied to survival prognosis of cancer patients. Specifically, Deep Learning versions of the Cox proportional hazards model are trained on transcriptomic data to predict survival outcomes in cancer patients.
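As an illustration of what a Deep Learning version of the Cox model typically optimizes, the sketch below computes the negative log partial likelihood, L = -Σ_{i: event} (η_i - log Σ_{j: t_j ≥ t_i} exp(η_j)), where η_i is the log-risk score a network outputs for patient i. It is a minimal pure-Python sketch and not the implementation of any of the models named in this abstract; the function name, argument names, and the choice to keep tied times in the risk set (Breslow-style) are assumptions made for illustration.

```python
import math


def cox_neg_log_partial_likelihood(log_risks, times, events):
    """Negative log Cox partial likelihood (hypothetical helper).

    log_risks: predicted log-risk score per patient (e.g. a network output)
    times:     observed follow-up time per patient
    events:    1/True if death was observed, 0/False if censored
    """
    loss = 0.0
    for eta_i, t_i, observed in zip(log_risks, times, events):
        if not observed:
            # Censored patients contribute only through other patients' risk sets.
            continue
        # Risk set: everyone still under observation at time t_i (ties included).
        risk_set = [eta_j for eta_j, t_j in zip(log_risks, times) if t_j >= t_i]
        # Log-sum-exp with a shift for numerical stability.
        shift = max(risk_set)
        log_denominator = shift + math.log(sum(math.exp(e - shift) for e in risk_set))
        loss -= eta_i - log_denominator
    return loss
```

In a trained model the log-risk scores would come from the final layer of the network, and this quantity (usually averaged over observed events and combined with a regularization term) would be minimized by gradient descent.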
In this study, a broad analysis was performed on TCGA cancers using a variety of Deep Learning-based models, including Cox-nnet, DeepSurv, and a method proposed by our group named AECOX (AutoEncoder with Cox regression network). The concordance index and the p-value of the log-rank test were used to evaluate model performance.
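For readers unfamiliar with the concordance index, the following sketch shows one common definition (Harrell's C-index): among all comparable patient pairs, the fraction in which the patient with the higher predicted risk has the shorter observed survival time. This is an illustrative pure-Python version and not the evaluation code used in the study; the function name and the conventions for ties (tied risk scores count as 0.5, tied times are skipped) are assumptions.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (hypothetical helper).

    A value of 1.0 means perfect ranking, 0.5 is no better than chance.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time is an observed event rather than a censoring time.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

The quadratic loop over pairs is chosen for clarity; for large cohorts a library implementation would normally be preferred.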
All models show competitive results across 12 cancer types. The last hidden layers of the Deep Learning approaches are lower-dimensional representations of the input data that can be used for feature reduction and visualization. Furthermore, the prognosis performance reveals a negative correlation among model accuracy, overall survival time statistics, and tumor mutation burden (TMB), suggesting an association among overall survival time, TMB, and prognosis prediction accuracy.
Deep Learning based algorithms outperform traditional machine learning based models. The cancer prognosis results, measured by concordance index, are indistinguishable across models but highly variable across cancers. These findings shed some light on the relationships between patient characteristics and survival learnability at the pan-cancer level.