Bian Jilong, Zhang Xi, Zhang Xiying, Xu Dali, Wang Guohua
College of Information and Computer Engineering, Northeast Forestry University, 150004, Harbin, China.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad082.
Accurate and effective drug-target interaction (DTI) prediction can greatly shorten the drug development lifecycle and reduce its cost. In deep-learning-based DTI prediction, robust feature representations of drugs and proteins, together with their interaction features, play a key role in improving accuracy. Class imbalance and overfitting in drug-target datasets also affect prediction accuracy, and reducing computational resource consumption and speeding up training are further critical considerations. In this paper, we propose shared-weight-based MultiheadCrossAttention, a precise and concise attention mechanism that establishes the association between target and drug, making our models more accurate and faster. We then use this cross-attention mechanism to construct two models: MCANet and MCANet-B. In MCANet, the cross-attention mechanism extracts the interaction features between drugs and proteins to improve their feature representations, and the PolyLoss loss function is applied to alleviate overfitting and class imbalance in the drug-target dataset. In MCANet-B, multiple MCANet models are combined, which improves robustness and further increases prediction accuracy. We train and evaluate the proposed methods on six public drug-target datasets and achieve state-of-the-art results. Compared with other baselines, MCANet saves considerable computational resources while keeping its accuracy in the leading position, whereas MCANet-B greatly improves prediction accuracy by combining multiple models while maintaining a balance between computational resource consumption and prediction accuracy.
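The abstract names the PolyLoss loss function but gives neither the variant nor the hyperparameters used. As a minimal sketch, assuming the common Poly-1 form (cross-entropy plus a term epsilon * (1 - p_t), where p_t is the predicted probability of the true class), the loss for a binary interaction label could look as follows. The function names and the default `epsilon=1.0` are illustrative assumptions, not values taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_loss(probs, label, epsilon=1.0):
    """Poly-1 loss for one sample (assumed form, epsilon is hypothetical).

    probs: class probabilities, e.g. [p_no_interaction, p_interaction]
    label: index of the true class
    """
    p_t = probs[label]                  # probability assigned to the true class
    cross_entropy = -math.log(p_t)      # standard cross-entropy term
    return cross_entropy + epsilon * (1.0 - p_t)


def mean_poly1_loss(batch_logits, labels, epsilon=1.0):
    """Average Poly-1 loss over a batch of raw logits."""
    losses = [
        poly1_loss(softmax(logits), y, epsilon)
        for logits, y in zip(batch_logits, labels)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two drug-target pairs: one labelled interacting (1), one not (0).
    logits = [[0.2, 1.5], [0.9, -0.3]]
    labels = [1, 0]
    print(mean_poly1_loss(logits, labels, epsilon=1.0))
```

With `epsilon=0` the function reduces to plain cross-entropy, so the extra term can be read as an adjustable weight on confidently misclassified samples.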