Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, Hunan, China.
Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae165.
Studying the molecular heterogeneity of cancer is essential for achieving personalized therapy. At the same time, understanding the biological processes that drive cancer development can reveal valuable therapeutic targets. Accurate and interpretable clinical prediction therefore requires thoroughly characterizing patients at both the molecular and the biological pathway levels.
Here, we present GraphPath, a biological knowledge-driven graph neural network with a multi-head self-attention mechanism that incorporates a pathway-pathway interaction network. We train GraphPath to classify the cancer status of patients with prostate cancer based on their multi-omics profiles. Experimental results show that our method outperforms P-NET and other baseline methods. In addition, two external cohorts are used to validate that the model generalizes to unseen samples with adequate predictive performance. We reduce the dimensionality of the latent pathway embeddings and visualize the corresponding classes to further illustrate the model's performance. Additionally, because GraphPath's predictions are interpretable, we identify target cancer-associated pathways that contribute significantly to the model's predictions. Such a robust and interpretable model has the potential to greatly enhance our understanding of cancer's biological mechanisms and accelerate the development of targeted therapies.
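To make the core idea concrete, the following is a minimal sketch of multi-head self-attention restricted to the edges of a pathway-pathway graph. It is not the GraphPath implementation: the function names (`masked_multihead_attention`, `random_matrix`), the toy adjacency matrix, the feature vectors, and the random projection weights are all assumptions introduced for illustration, and a trained model would learn the projections rather than sample them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols):
    """Hypothetical stand-in for learned projection weights."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def masked_multihead_attention(X, adj, heads):
    """Attention over pathway nodes, limited to graph neighbours.

    X     : one feature vector per pathway (assumed per-pathway omics summary)
    adj   : 0/1 pathway-pathway adjacency matrix (assumed, symmetric)
    heads : list of (Wq, Wk, Wv) projection matrices, one triple per head
    Returns one embedding per pathway: the concatenation of all head outputs.
    """
    n = len(X)
    out = [[] for _ in range(n)]
    for Wq, Wk, Wv in heads:
        Q = [matvec(Wq, x) for x in X]
        K = [matvec(Wk, x) for x in X]
        V = [matvec(Wv, x) for x in X]
        d = len(Q[0])
        for i in range(n):
            # Each pathway attends to itself and to its graph neighbours only.
            nbrs = [j for j in range(n) if j == i or adj[i][j]]
            scores = [
                sum(q * k for q, k in zip(Q[i], K[j])) / math.sqrt(d)
                for j in nbrs
            ]
            weights = softmax(scores)
            context = [
                sum(w * V[j][c] for w, j in zip(weights, nbrs))
                for c in range(d)
            ]
            out[i].extend(context)
    return out


# Toy example: three pathways, where pathways 0 and 1 interact and 2 is isolated.
random.seed(0)
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
heads = [tuple(random_matrix(2, 2) for _ in range(3)) for _ in range(2)]
embeddings = masked_multihead_attention(X, adj, heads)
```

In this sketch the attention weights themselves are what would be inspected for interpretability: a large weight from pathway `i` to pathway `j` indicates that `j` contributed strongly to the embedding of `i`. How GraphPath actually derives its pathway importance scores is not specified in the abstract.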