Abdel-Basset Mohamed, Hawash Hossam, Elhoseny Mohamed, Chakrabortty Ripon K, Ryan Michael
Faculty of Computers and Informatics, Zagazig University, Zagazig 44519, Egypt.
Department of Computer Science, College of Computer Information Technology, American University in the Emirates, Dubai 503000, United Arab Emirates.
IEEE Access. 2020 Sep 15;8:170433-170451. doi: 10.1109/ACCESS.2020.3024238. eCollection 2020.
The rapid spread of novel coronavirus pneumonia (COVID-19) has led to a dramatically increased mortality rate worldwide. Despite many efforts, developing an effective vaccine for this novel virus will take considerable time and relies on the identification of drug-target (DT) interactions, using commercially available medications to identify potential inhibitors. Motivated by this, we propose a new framework, called DeepH-DTA, for predicting DT binding affinities for heterogeneous drugs. We propose a heterogeneous graph attention (HGAT) model to learn topological information of compound molecules, and bidirectional ConvLSTM layers to model spatio-sequential information in simplified molecular-input line-entry system (SMILES) sequences of drug data. For protein sequences, we propose a squeezed-excited dense convolutional network that learns hidden representations within amino acid sequences, and we use advanced embedding techniques to encode both kinds of input sequences. The performance of DeepH-DTA is evaluated through extensive experiments against cutting-edge approaches on two public datasets (Davis and KIBA), which comprise eclectic samples of the kinase protein family and the pertinent inhibitors. DeepH-DTA attains the highest concordance index (CI), 0.924 and 0.927, and a mean squared error (MSE) of 0.195 and 0.111 on the Davis and KIBA datasets, respectively. Moreover, a study using FDA-approved drugs from the DrugBank database is performed with DeepH-DTA to predict the affinity scores of drugs against SARS-CoV-2 amino acid sequences, and the results show that the model can predict some of the SARS-CoV-2 inhibitors that have recently been approved in many clinical studies.
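The abstract reports CI and MSE without defining them. As a minimal sketch, and assuming the pairwise definition of the concordance index that is common in binding-affinity prediction work (the abstract does not confirm which variant was used), the two metrics can be computed as follows. The function names and the toy affinity values are illustrative only and are not taken from the paper.

```python
import numpy as np


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Assumed definition: a pair (i, j) is comparable when y_true[i] > y_true[j];
    it scores 1 if y_pred[i] > y_pred[j], 0.5 on a tie, and 0 otherwise.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(n):
            if y_true[i] > y_true[j]:
                comparable += 1
                if y_pred[i] > y_pred[j]:
                    concordant += 1.0
                elif y_pred[i] == y_pred[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs: all true affinities are equal")
    return concordant / comparable


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Toy affinity values, made up for illustration (not data from the paper).
true_affinity = [5.0, 6.2, 7.1, 8.4]
pred_affinity = [5.1, 6.0, 7.5, 7.3]

ci = concordance_index(true_affinity, pred_affinity)
mse = mean_squared_error(true_affinity, pred_affinity)
print(f"CI = {ci:.3f}, MSE = {mse:.3f}")
```

A CI of 1.0 means every pair of drug-target samples is ranked in the correct order and 0.5 corresponds to random ranking, so higher is better, whereas a lower MSE is better. The quadratic loop is fine for a toy example but would be slow on datasets the size of Davis or KIBA.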