Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal.
IEETA, Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal.
Comput Biol Med. 2022 Aug;147:105772. doi: 10.1016/j.compbiomed.2022.105772. Epub 2022 Jun 21.
The accurate identification of Drug-Target Interactions (DTIs) remains a critical step in drug discovery and in understanding the binding process. Despite recent advances in computational solutions that overcome the challenges of in vitro and in vivo experiments, most proposed in silico methods still focus on binary classification, overlooking the importance of characterizing DTIs with unbiased binding-strength values to properly distinguish primary interactions from off-target ones. Moreover, several of these methods oversimplify the interaction mechanism, neglecting the joint contribution of the individual units of each binding component and of the interacting substructures involved, and have yet to adopt more explainable and interpretable architectures. In this study, we propose an end-to-end Transformer-based architecture for predicting drug-target binding affinity (DTA) using 1D raw sequential and structural data to represent the proteins and compounds. This architecture exploits self-attention layers to capture the biological and chemical context of the proteins and compounds, respectively, and cross-attention layers to exchange information and capture the pharmacological context of the DTIs. The results show that the proposed architecture is effective in predicting DTA, outperforming state-of-the-art baselines both in predicting the value of the interaction strength and in discriminating the rank order of binding strength. The combination of multiple Transformer-Encoders yields robust and discriminative aggregate representations of the proteins and compounds for binding affinity prediction, in which the addition of a Cross-Attention Transformer-Encoder proved to be an important block for improving the discriminative power of these representations.
Overall, this study validates the applicability of an end-to-end Transformer-based architecture in the context of drug discovery, whose attention blocks inherently provide different levels of insight into potential DTIs and the resulting predictions. The data and source code used in this study are available at: https://github.com/larngroup/DTITR.
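The abstract's core mechanism, self-attention within each sequence and cross-attention between the protein and compound representations, can be sketched conceptually in a few lines of numpy. This is a minimal illustration of scaled dot-product attention, not the authors' DTITR implementation; the token counts, embedding dimension, and single-head, weight-free formulation are simplifying assumptions for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(QK^T / sqrt(d)) V.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores) @ v

# Toy embeddings (random, for illustration only):
# 12 protein residue tokens and 8 compound (SMILES) tokens, dim 16.
rng = np.random.default_rng(0)
protein = rng.normal(size=(12, 16))
compound = rng.normal(size=(8, 16))

# Self-attention: protein tokens attend to each other (biological context).
p_self = attention(protein, protein, protein)

# Cross-attention: protein queries attend to compound keys/values,
# exchanging information between the two branches (pharmacological context).
p_cross = attention(protein, compound, compound)

print(p_self.shape, p_cross.shape)  # (12, 16) (12, 16)
```

In the full architecture, stacks of such layers (with learned projection weights and multiple heads) produce the aggregate protein and compound representations, and the attention weight matrices themselves are what make the model's predictions inspectable.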