Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
Department of Computer Science, Virginia State University, Petersburg, VA 23806, USA.
Bioinformatics. 2021 Jul 12;37(Suppl_1):i443-i450. doi: 10.1093/bioinformatics/btab285.
Convolutional neural networks (CNNs) have achieved great success in image processing and computer vision, handling grid-structured inputs and efficiently capturing local dependencies through multiple levels of abstraction. However, a lack of interpretability remains a key barrier to the adoption of deep neural networks, particularly in predictive modeling of disease outcomes. Moreover, because biological array data are generally not represented in a grid-structured format, CNNs cannot be applied to them directly.
To address these issues, we propose a novel method, called PathCNN, that constructs an interpretable CNN model on integrated multi-omics data using a newly defined pathway image. PathCNN showed promising predictive performance in differentiating between long-term survival (LTS) and non-LTS when applied to glioblastoma multiforme (GBM). The adoption of a visualization tool coupled with statistical analysis enabled the identification of plausible pathways associated with survival in GBM. In summary, PathCNN demonstrates that CNNs can be effectively applied to multi-omics data in an interpretable manner, resulting in promising predictive power while identifying key biological correlates of disease.
The source code is freely available at: https://github.com/mskspi/PathCNN.
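The abstract does not specify how a pathway image is constructed, so the following is only a minimal sketch of one plausible construction, not the authors' implementation. It assumes each row of the image is a pathway and each group of columns holds the first few principal components of one omics type restricted to that pathway's genes. The function names (`top_components`, `build_pathway_images`), the omics labels, the random data, and the choice of `k` are all hypothetical.

```python
import numpy as np


def top_components(x, k):
    """Return the first k principal component scores of x (samples in rows)."""
    centered = x - x.mean(axis=0, keepdims=True)
    # SVD of the centered matrix: u * s gives the PC scores per sample.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u * s
    # Zero-pad when the pathway has fewer than k usable components.
    out = np.zeros((x.shape[0], k))
    m = min(k, scores.shape[1])
    out[:, :m] = scores[:, :m]
    return out


def build_pathway_images(omics, pathways, k=2):
    """Build one 2D array per sample (assumed layout, see lead-in).

    omics    : dict mapping omics name -> (n_samples, n_genes) array
    pathways : list of integer index arrays, one per pathway
    returns  : array of shape (n_samples, n_pathways, n_omics * k)
    """
    names = sorted(omics)
    n_samples = omics[names[0]].shape[0]
    images = np.zeros((n_samples, len(pathways), len(names) * k))
    for p, genes in enumerate(pathways):
        for o, name in enumerate(names):
            block = top_components(omics[name][:, genes], k)
            images[:, p, o * k:(o + 1) * k] = block
    return images


# Hypothetical toy data: 20 samples, 50 genes, three omics types.
rng = np.random.default_rng(0)
n_samples, n_genes = 20, 50
omics = {
    name: rng.normal(size=(n_samples, n_genes))
    for name in ("cnv", "methylation", "mrna")
}
# Hypothetical pathways given as gene column indices.
pathways = [
    np.array([0, 1, 2, 3, 4]),
    np.array([3, 4, 5, 6, 7, 8]),
    np.array([10, 11, 12]),
]

images = build_pathway_images(omics, pathways, k=2)
print(images.shape)
```

Each `images[i]` is a small grid that a standard 2D CNN could take as input. The row order here is arbitrary; because convolutions exploit locality, a real pipeline would need a principled ordering of pathways, which this sketch does not attempt.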