School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China.
Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i158-i167. doi: 10.1093/bioinformatics/btad261.
Synthetic lethality (SL) is a promising strategy for anticancer therapy, as inhibiting SL partners of genes with cancer-specific mutations can selectively kill cancer cells without harming normal cells. Wet-lab techniques for SL screening suffer from issues such as high cost and off-target effects. Computational methods can help address these issues. Previous machine learning methods leverage known SL pairs, and the use of knowledge graphs (KGs) can significantly enhance prediction performance. However, the subgraph structures of KGs have not been fully explored. In addition, most machine learning methods lack interpretability, which is an obstacle to the wide application of machine learning to SL identification.
We present a model named KR4SL to predict SL partners for a given primary gene. It captures the structural semantics of a KG by efficiently constructing and learning from relational digraphs in the KG. To encode the semantic information of the relational digraphs, we fuse the textual semantics of entities into propagated messages and enhance the sequential semantics of paths using a recurrent neural network. Moreover, we design an attentive aggregator that identifies the critical subgraph structures contributing the most to the SL prediction and presents them as explanations. Extensive experiments under different settings show that KR4SL significantly outperforms all the baselines. The explanatory subgraphs for the predicted gene pairs can unveil the prediction process and the mechanisms underlying synthetic lethality. The improved predictive power and interpretability indicate that deep learning is practically useful for SL-based cancer drug target discovery.
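To make the idea of attention over a relational digraph more concrete, the following is a minimal sketch and not the KR4SL implementation. It assumes a tiny made-up KG, scalar node states in place of learned embeddings, and hand-set relation weights in place of trained parameters; it also omits the textual-semantics fusion and the recurrent network. All entity names, relation names, and the functions `build_relational_digraph` and `attentive_propagate` are hypothetical and serve only as illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy KG as (head, relation, tail) triples; names are illustrative only.
TRIPLES = [
    ("GENE_A", "participates_in", "PATHWAY_X"),
    ("GENE_A", "interacts_with", "GENE_B"),
    ("GENE_B", "participates_in", "PATHWAY_X"),
    ("PATHWAY_X", "includes", "GENE_C"),
    ("GENE_B", "interacts_with", "GENE_C"),
    ("GENE_D", "interacts_with", "GENE_A"),  # points into the source, so never traversed
]

# Assumed fixed relation weights; a real model would learn these.
REL_WEIGHT = {"participates_in": 0.5, "interacts_with": 1.0, "includes": 0.8}


def build_relational_digraph(triples, source, num_layers):
    """Collect, hop by hop, the edges leaving the nodes reached from `source`."""
    out_edges = defaultdict(list)
    for h, r, t in triples:
        out_edges[h].append((h, r, t))
    frontier = {source}
    layers = []
    for _ in range(num_layers):
        edges = [e for node in sorted(frontier) for e in out_edges[node]]
        layers.append(edges)
        frontier = {t for _, _, t in edges}
    return layers


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attentive_propagate(layers, source, rel_weight):
    """Propagate scalar states layer by layer with attention over incoming edges.

    Returns the final node states and, per layer, the attention weight of each edge.
    """
    state = {source: 1.0}
    attention = []
    for edges in layers:
        incoming = defaultdict(list)
        for h, r, t in edges:
            # Message = state of the head node scaled by the relation weight.
            incoming[t].append(((h, r, t), state[h] * rel_weight[r]))
        new_state, layer_att = {}, {}
        for t, items in incoming.items():
            # Here the message value doubles as the attention score (a simplification).
            alphas = softmax([msg for _, msg in items])
            new_state[t] = sum(a * msg for a, (_, msg) in zip(alphas, items))
            for a, (edge, _) in zip(alphas, items):
                layer_att[edge] = a
        state = new_state
        attention.append(layer_att)
    return state, attention


if __name__ == "__main__":
    layers = build_relational_digraph(TRIPLES, "GENE_A", num_layers=2)
    state, attention = attentive_propagate(layers, "GENE_A", REL_WEIGHT)
    # Rank the last-layer edges into GENE_C by attention as a toy "explanation".
    into_c = {e: a for e, a in attention[-1].items() if e[2] == "GENE_C"}
    for edge, alpha in sorted(into_c.items(), key=lambda kv: -kv[1]):
        print(edge, round(alpha, 3))
```

In this toy setting, the edges with the highest attention into a candidate partner play the role that the explanatory subgraphs play in the abstract: they indicate which paths from the primary gene contributed most to the score.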
The source code is freely available at https://github.com/JieZheng-ShanghaiTech/KR4SL.