TC-DTA: Predicting Drug-Target Binding Affinity With Transformer and Convolutional Neural Networks.

Publication information

IEEE Trans Nanobioscience. 2024 Oct;23(4):572-578. doi: 10.1109/TNB.2024.3441590. Epub 2024 Oct 15.

Abstract

Bioinformatics is a rapidly evolving field that applies computational methods to analyze and interpret biological data. A key task in bioinformatics is identifying novel drug-target interactions (DTIs), which plays a crucial role in drug discovery. Most computational approaches treat DTI prediction as a binary classification problem: deciding whether a given drug-target pair interacts. However, with the growing availability of drug-target binding affinity data, this binary task can be reframed as a regression problem focused on drug-target affinity (DTA). DTA quantifies the strength of drug-target binding, offering more detailed insight than DTI and serving as a valuable tool for virtual screening in drug discovery. Accurately predicting compound interactions with targets can accelerate the drug development process. In this study, we introduce a deep learning model named TC-DTA for DTA prediction, leveraging convolutional neural networks (CNNs) and the encoder module of the Transformer architecture. We begin by extracting raw drug SMILES strings and protein amino acid sequences from the dataset, which are then represented using various encoding methods. Subsequently, we employ a CNN and the Transformer's encoder module to extract features from the drug SMILES strings and the protein sequences, respectively. Finally, the extracted features are concatenated and fed into a multi-layer perceptron to predict binding affinity scores. We evaluated our model on two benchmark DTA datasets, Davis and KIBA, comparing it with methods such as KronRLS, SimBoost, and DeepDTA. Our model, TC-DTA, outperformed these baseline methods on evaluation metrics including Mean Squared Error (MSE), Concordance Index (CI), and the Regression towards the Mean Index (r_m^2). These results highlight the effectiveness of the Transformer's encoder and the CNN in extracting meaningful representations from sequences, thereby enhancing DTA prediction accuracy.
This deep learning model can accelerate drug discovery by identifying drug candidates with high binding affinity to specific targets. Compared to traditional methods, machine learning technology offers a more effective and efficient approach to drug discovery.
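
Two of the evaluation metrics named in the abstract, MSE and CI, can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names, the pairwise definition of CI (ties in the prediction count as 0.5, pairs with equal true affinity are skipped), and the sample affinity values are all assumptions made for illustration.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Assumed convention: pairs with identical true affinity are not comparable,
    and a tie in the predictions contributes 0.5.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # equal true affinities carry no ordering information
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical affinity values, for illustration only.
measured = [5.0, 6.2, 7.1, 8.3]
predicted = [5.1, 6.0, 7.5, 7.4]
mse = mean_squared_error(measured, predicted)
ci = concordance_index(measured, predicted)
print(f"MSE = {mse:.3f}, CI = {ci:.3f}")
```

A lower MSE means predictions are closer to the measured affinities, while a CI of 1.0 means every comparable pair is ranked in the correct order. The pairwise loop is quadratic in the number of samples, which is fine for a sketch but would need a faster implementation on datasets the size of KIBA.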
