Kennesaw State University, Kennesaw, USA.
Kennesaw State University, Marietta, USA.
BMC Bioinformatics. 2018 Dec 17;19(1):510. doi: 10.1186/s12859-018-2500-z.
Predicting patient prognosis from large-scale genomic data is a fundamentally challenging problem in genomic medicine, and prognostic prediction remains poor in many diseases. This may be due to the high complexity of biological systems, which involve multiple biological components and the hierarchical relationships among them. Moreover, it is difficult to develop robust computational solutions for high-dimensional, low-sample-size data.
In this study, we propose a Pathway-Associated Sparse Deep Neural Network (PASNet) that not only predicts patients' prognoses but also describes the complex biological processes, at the level of biological pathways, that underlie prognosis. PASNet leverages deep learning to model a multilayered, hierarchical biological system of genes and pathways and uses it to predict clinical outcomes. The sparse solution of PASNet provides the model interpretability that most conventional fully-connected neural networks lack. We applied PASNet to long-term survival prediction in glioblastoma multiforme (GBM), a primary brain cancer for which prognostic performance is poor. The predictive performance of PASNet was evaluated with multiple cross-validation experiments. PASNet achieved a higher Area Under the Curve (AUC) and F1-score than previous long-term survival prediction classifiers, and the significance of the difference in performance was assessed with the Wilcoxon signed-rank test. Furthermore, the biological pathways identified by PASNet have been reported as significant in GBM in previous biological and medical research.
PASNet can describe the different biological systems underlying clinical outcomes while predicting prognosis more accurately than current state-of-the-art methods. To the best of our knowledge, PASNet is the first pathway-based deep neural network that models hierarchical representations of genes and pathways and their nonlinear effects. PASNet is also promising because of its flexible model representation and interpretability, embodying the strengths of deep learning. The open-source code of PASNet is available at https://github.com/DataX-JieHao/PASNet.
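The idea of a sparse gene-to-pathway layer described above can be illustrated with a minimal sketch. This is not the PASNet implementation: the gene and pathway names, the random weights, and the tanh activation are all assumptions chosen for illustration. The sketch only shows how a binary membership mask restricts each pathway node to the genes that belong to it, which is what makes each pathway node interpretable.

```python
import math
import random

# Toy gene list and pathway membership (hypothetical, for illustration only).
genes = ["gene_1", "gene_2", "gene_3", "gene_4"]
pathways = {
    "pathway_A": {"gene_1", "gene_3"},
    "pathway_B": {"gene_2", "gene_4"},
}


def build_mask(pathways, genes):
    """Return a pathways-by-genes 0/1 matrix: 1 if the gene is in the pathway."""
    return [[1 if g in members else 0 for g in genes]
            for members in pathways.values()]


def pathway_layer(gene_expr, weights, bias, mask):
    """Forward pass of a masked layer: each pathway node only sees its own genes.

    The activation (tanh) is an arbitrary choice for this sketch.
    """
    activations = []
    for p in range(len(mask)):
        z = bias[p]
        for g, x in enumerate(gene_expr):
            # A zero in the mask removes the gene-to-pathway connection.
            z += mask[p][g] * weights[p][g] * x
        activations.append(math.tanh(z))
    return activations


random.seed(0)
mask = build_mask(pathways, genes)
weights = [[random.gauss(0.0, 1.0) for _ in genes] for _ in pathways]
bias = [0.0 for _ in pathways]

# Expression values for one hypothetical patient.
expr = [0.5, -1.2, 0.3, 2.0]
baseline = pathway_layer(expr, weights, bias, mask)

# Perturb gene_2, which belongs to pathway_B only.
perturbed_expr = list(expr)
perturbed_expr[1] = 3.0
perturbed = pathway_layer(perturbed_expr, weights, bias, mask)

print("baseline: ", baseline)
print("perturbed:", perturbed)
```

In this toy setting, changing gene_2 alters only the pathway_B activation, because the mask removes its connection to pathway_A. In a trained network of this kind, the same masking would presumably be applied to the weights during training so that pruned connections stay at zero.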