DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks.

Affiliations

Department of Electrical and Computer Engineering, College Station, TX, USA.

TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, College Station, TX, USA.

Publication information

Bioinformatics. 2019 Sep 15;35(18):3329-3338. doi: 10.1093/bioinformatics/btz111.

Abstract

MOTIVATION

Drug discovery demands rapid quantification of compound-protein interaction (CPI). However, there is a lack of methods that can predict compound-protein affinity from sequences alone with high applicability, accuracy and interpretability.

RESULTS

We present a seamless integration of domain knowledge and learning-based approaches. Using novel representations of structurally annotated protein sequences, we propose a semi-supervised deep learning model that unifies recurrent and convolutional neural networks to exploit both unlabeled and labeled data, jointly encoding molecular representations and predicting affinities. Our representations and models outperform conventional options, achieving relative error in IC50 within 5-fold for test cases and within 20-fold for protein classes not included in training. Performance for new protein classes with few labeled data is further improved by transfer learning. Furthermore, separate and joint attention mechanisms are developed and embedded in our model to add to its interpretability, as illustrated in case studies for predicting and explaining selective drug-target interactions. Lastly, alternative representations using protein sequences or compound graphs, and a unified RNN/GCNN-CNN model using graph CNN (GCNN), are also explored to reveal algorithmic challenges ahead.
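To make the "within 5-fold" and "within 20-fold" figures concrete, the sketch below converts a fold error in IC50 into the equivalent error on a log10 scale and checks whether a single prediction falls within a given fold. It is only an illustration: it assumes that fold error means the symmetric ratio between predicted and measured IC50, and the function names are hypothetical and do not come from the DeepAffinity code base.

```python
import math


def fold_error(predicted_ic50: float, measured_ic50: float) -> float:
    """Symmetric fold error (always >= 1) between two positive IC50 values.

    Assumption for illustration: over- and under-prediction by the same
    ratio count as the same fold error.
    """
    ratio = predicted_ic50 / measured_ic50
    return max(ratio, 1.0 / ratio)


def fold_to_log10_units(fold: float) -> float:
    """Error in log10(IC50) units that corresponds to a given fold error."""
    return math.log10(fold)


def within_fold(predicted_ic50: float, measured_ic50: float, fold: float) -> bool:
    """True if the prediction is within `fold`-fold of the measurement."""
    return fold_error(predicted_ic50, measured_ic50) <= fold


if __name__ == "__main__":
    # Hypothetical values in nM, not data from the paper.
    print(fold_error(40.0, 10.0))          # 4-fold over-prediction
    print(within_fold(40.0, 10.0, 5.0))    # inside a 5-fold band
    print(round(fold_to_log10_units(5.0), 3))
    print(round(fold_to_log10_units(20.0), 3))
```

Under this reading, a 5-fold error corresponds to about 0.70 log10 units and a 20-fold error to about 1.30 log10 units, which is how a fold-error band maps onto errors in log-transformed affinity values.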

AVAILABILITY AND IMPLEMENTATION

Data and source code are available at https://github.com/Shen-Lab/DeepAffinity.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
