Evans Nathaniel J, Mills Gordon B, Wu Guanming, Song Xubo, McWeeney Shannon
Division of Bioinformatics and Computational Biomedicine, Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, United States of America.
Division of Oncological Sciences Knight Cancer Institute, Oregon Health & Science University, Portland, OR, 97201, USA.
bioRxiv. 2024 Feb 29:2024.02.28.582164. doi: 10.1101/2024.02.28.582164.
Computational modeling of perturbation biology identifies relationships between molecular elements and cellular response, and an accurate understanding of these systems will support the full realization of precision medicine. Traditional deep learning, while often accurate in predicting response, is unlikely to capture the true sequence of molecular interactions involved. Our work is motivated by two assumptions: 1) methods that encourage mechanistic prediction logic are likely to be more trustworthy, and 2) problem-specific algorithms are likely to outperform generic algorithms. We present an alternative to Graph Neural Networks (GNNs), termed Graph Structured Neural Networks (GSNN), which uses cell signaling knowledge, encoded as a graph data structure, to add inductive biases to deep learning. We apply our method to perturbation biology using the LINCS L1000 dataset and literature-curated molecular interactions. We demonstrate that GSNNs outperform baseline algorithms in several prediction tasks, including 1) perturbed expression, 2) cell viability of drug combinations, and 3) disease-specific drug prioritization. We also present a method to explain GSNN predictions in a biologically interpretable form. This work has broad application in basic biological research and preclinical drug repurposing. Further refinement of these methods may produce trustworthy models of drug response suitable for use as clinical decision aids.
Our implementation of the GSNN method is available at https://github.com/nathanieljevans/GSNN. All data used in this work is publicly available.
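The idea of using a signaling graph as an inductive bias can be illustrated with a toy example. The sketch below is not the GSNN architecture and is not taken from the repository above; it is a minimal pure-Python illustration, assuming a hypothetical five-node graph and randomly chosen weights, of how masking a layer's weights with graph edges restricts information flow to known interactions. All names (`build_mask`, `masked_layer`, `propagate`, the node labels) are invented for this illustration.

```python
import math
import random


def build_mask(nodes, edges):
    """mask[i][j] is 1.0 when nodes[i] -> nodes[j] is an edge (or i == j), else 0.0."""
    index = {name: k for k, name in enumerate(nodes)}
    n = len(nodes)
    # Self-loops let each node retain its own state between steps.
    mask = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for src, dst in edges:
        mask[index[src]][index[dst]] = 1.0
    return mask


def masked_layer(x, weights, mask):
    """One update step: node j only receives input from nodes allowed by the mask."""
    n = len(x)
    return [
        math.tanh(sum(x[i] * weights[i][j] * mask[i][j] for i in range(n)))
        for j in range(n)
    ]


def propagate(x, weights, mask, steps):
    """Apply the masked layer repeatedly so a signal can travel along multi-edge paths."""
    for _ in range(steps):
        x = masked_layer(x, weights, mask)
    return x


# Hypothetical signaling chain: drug -> kinase -> tf -> gene, plus a disconnected node.
nodes = ["drug", "kinase", "tf", "gene", "unrelated"]
edges = [("drug", "kinase"), ("kinase", "tf"), ("tf", "gene")]

rng = random.Random(0)  # fixed seed so the toy weights are reproducible
weights = [[rng.uniform(-1.0, 1.0) for _ in nodes] for _ in nodes]
mask = build_mask(nodes, edges)

gene = nodes.index("gene")
drug_on = propagate([1.0, 0.0, 0.0, 0.0, 0.0], weights, mask, steps=3)
drug_on_short = propagate([1.0, 0.0, 0.0, 0.0, 0.0], weights, mask, steps=2)
unrelated_on = propagate([0.0, 0.0, 0.0, 0.0, 1.0], weights, mask, steps=3)

print("gene after perturbing drug (3 steps):     ", drug_on[gene])
print("gene after perturbing drug (2 steps):     ", drug_on_short[gene])
print("gene after perturbing unrelated (3 steps):", unrelated_on[gene])
```

In this toy setting, a perturbation at `drug` reaches `gene` only after three steps, matching the length of the path between them, while a perturbation at `unrelated` never changes `gene`, whatever the weights are. A fully connected layer would give no such guarantee, which is the sense in which the graph acts as a structural prior on the prediction logic.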