DL-SMILES#: A Novel Encoding Scheme for Predicting Compound Protein Affinity Using Deep Learning.

Author information

College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China.

Department of Neurology Medicine, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan 250033, China | College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China.

Publication information

Comb Chem High Throughput Screen. 2022;25(4):642-650. doi: 10.2174/1386207324666210219102728.

DOI:10.2174/1386207324666210219102728

PMID:33605851

Abstract

INTRODUCTION

Drug repositioning aims to identify drugs and therapeutic targets among approved drugs and abandoned compounds that have already been shown to be safe. This trend is changing the landscape of drug development and establishing drug repositioning as a model for developing new drugs. Over the past decade, machine learning methods have been applied to predict compound-protein binding affinity, and deep learning has recently become prominent and achieved strong performance. In these models, the compound is usually represented in a simple way, as a molecular fingerprint, i.e., a single SMILES string.
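To make the "single SMILES string" input concrete, the following is a minimal sketch of how such a string is commonly split into tokens before it is fed to a sequence model. It is not the SMILES# encoding, which the abstract does not specify. The regular expression, the function name `tokenize_smiles`, and the example molecule (aspirin) are assumptions chosen for illustration, and the pattern covers only common SMILES symbols.

```python
import re
from typing import List

# Illustrative pattern (an assumption, not taken from the paper): bracket atoms,
# two-letter halogens, organic-subset atoms, aromatic atoms, two-digit ring
# closures, single-digit ring closures, then bond/branch/other symbols.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[=#\-+\\/().:~@*$])"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; raise if any character is not covered."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    # Aspirin, used here only as a familiar example molecule.
    print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))
```

Listing `Br` and `Cl` before the single-letter atoms keeps chlorine from being read as carbon followed by an unknown `l`, which is the usual pitfall of character-level tokenization.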

METHODS

In this study, we improve on previous work by proposing a novel representation, named SMILES#, that re-encodes the SMILES string. This representation takes the properties of compounds into account and achieves superior performance. We then propose a deep learning model that combines recurrent neural networks and a convolutional neural network with an attention mechanism, using both unlabeled and labeled data to jointly encode molecules and predict binding affinity.
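The abstract does not describe the architecture in detail, so the following is only a minimal sketch of one generic form of attention: dot-product attention pooling over per-token feature vectors, such as those a recurrent or convolutional encoder might produce. The function names, the query vector, and the toy features are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    features: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Pool a sequence of feature vectors into one vector.

    Each position is scored by its dot product with `query` (a hypothetical
    learned vector); the scores are normalized with softmax and used as
    weights in a weighted sum of the feature vectors.
    """
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(query)
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    # Toy per-token features; real ones would come from an encoder network.
    toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    pooled, weights = attention_pool(toy_features, query=[2.0, 0.0])
    print(pooled, weights)
```

The attention weights sum to one, so they can be read as the relative contribution of each token to the pooled vector.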

RESULTS

Experimental results show that SMILES# with compound properties effectively improves model accuracy and reduces the root-mean-square (RMS) error on most data sets.
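For reference, the RMS error between predicted and measured affinities is the square root of the mean squared difference. A minimal sketch follows; the function name `rmse` and the numbers are made up for illustration and are not results from the paper.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Invented affinity values, for illustration only.
    measured = [6.2, 7.5, 5.1]
    predicted = [6.0, 7.9, 5.4]
    print(rmse(measured, predicted))
```

Because the errors are squared before averaging, a few large mispredictions raise the RMS error more than many small ones.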

CONCLUSION

We used the method to evaluate related and unrelated compounds with the same target, and the experimental results demonstrate its effectiveness.

Similar articles

SSGraphCPI: A Novel Model for Predicting Compound-Protein Interactions Based on Deep Learning.

Int J Mol Sci. 2022 Mar 29;23(7):3780. doi: 10.3390/ijms23073780.

Convolutional neural network based on SMILES representation of compounds for detecting chemical motif.

BMC Bioinformatics. 2018 Dec 31;19(Suppl 19):526. doi: 10.1186/s12859-018-2523-5.

A Novel Molecular Representation Learning for Molecular Property Prediction with a Multiple SMILES-Based Augmentation.

Comput Intell Neurosci. 2022 Jan 28;2022:8464452. doi: 10.1155/2022/8464452. eCollection 2022.

AttentionDTA: Drug-Target Binding Affinity Prediction by Sequence-Based Deep Learning With Attention Mechanism.

IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):852-863. doi: 10.1109/TCBB.2022.3170365. Epub 2023 Apr 3.

DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks.

Bioinformatics. 2019 Sep 15;35(18):3329-3338. doi: 10.1093/bioinformatics/btz111.

Prediction of drug protein interactions based on variable scale characteristic pyramid convolution network.

Methods. 2023 Mar;211:42-47. doi: 10.1016/j.ymeth.2023.02.007. Epub 2023 Feb 15.

Learning to SMILES: BAN-based strategies to improve latent representation learning from molecules.

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab327.

Deep learning for retention time prediction in reversed-phase liquid chromatography.

J Chromatogr A. 2022 Feb 8;1664:462792. doi: 10.1016/j.chroma.2021.462792. Epub 2021 Dec 30.

BACPI: a bi-directional attention neural network for compound-protein interaction and binding affinity prediction.

Bioinformatics. 2022 Mar 28;38(7):1995-2002. doi: 10.1093/bioinformatics/btac035.
