Zheng Jie, Xiao Xuan, Qiu Wang-Ren
Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China.
Front Genet. 2022 Jun 8;13:859188. doi: 10.3389/fgene.2022.859188. eCollection 2022.
Drug-target interactions (DTIs) are regarded as an essential part of genomic drug discovery. Computational prediction of DTIs can accelerate the search for lead drugs for a given target and compensate for the limitations of time-consuming and expensive wet-lab techniques. Currently, many computational methods predict DTIs based on the sequence composition or physicochemical properties of the drug and target, but further efforts are needed to improve them. In this article, we propose a new sequence-based method for accurately identifying DTIs. For target proteins, we explore using pre-trained Bidirectional Encoder Representations from Transformers (BERT) to extract sequence features, which can provide unique and valuable pattern information. For drug molecules, the Discrete Wavelet Transform (DWT) is employed to generate information from drug molecular fingerprints. We then concatenate the drug and target feature vectors of each DTI and feed them into a feature extraction module that further extracts DTI features. This module consists of a Convolutional Neural Network module and a block made up of a batch-norm layer, a rectified linear activation layer, and a linear layer, which we call a BRL block. Subsequently, a BRL block is used as the prediction engine. After the model was optimized with contrastive loss and cross-entropy loss, it gave prediction accuracies of up to 90.1%, 94.7%, 94.9%, and 89% for the target families of G protein-coupled receptors, ion channels, enzymes, and nuclear receptors, respectively, indicating that the proposed method can outperform existing predictors. To make it as convenient as possible for researchers, the web server for the new predictor is freely accessible at: https://bioinfo.jcu.edu.cn/dtibert or http://121.36.221.79/dtibert/. The proposed method may also be a potential option for other DTIs.
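The feature pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not name the wavelet, so a single-level Haar transform is assumed here. The 8-bit fingerprints, the 4-dimensional protein vectors (stand-ins for BERT embeddings), the random weights, and the function names `haar_dwt`, `batch_norm`, `relu`, `linear`, and `brl_block` are all hypothetical. The Convolutional Neural Network module and the contrastive and cross-entropy losses are omitted.

```python
import math
import random


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def batch_norm(batch, eps=1e-5):
    """Normalize each feature column to zero mean and unit variance over the batch."""
    n = len(batch)
    cols = list(zip(*batch))
    means = [sum(c) / n for c in cols]
    variances = [sum((x - m) ** 2 for x in c) / n for c, m in zip(cols, means)]
    return [
        [(x - m) / math.sqrt(v + eps) for x, m, v in zip(row, means, variances)]
        for row in batch
    ]


def relu(batch):
    """Rectified linear activation applied element-wise."""
    return [[max(0.0, x) for x in row] for row in batch]


def linear(batch, weights, bias):
    """Affine map; weights has shape (out_features, in_features)."""
    return [
        [sum(w * x for w, x in zip(w_row, row)) + b for w_row, b in zip(weights, bias)]
        for row in batch
    ]


def brl_block(batch, weights, bias):
    """Batch-norm, then ReLU, then linear, in the order listed in the abstract."""
    return linear(relu(batch_norm(batch)), weights, bias)


# Toy inputs (assumed): three drug fingerprints and three protein embeddings.
fingerprints = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 1, 0],
]
protein_embeddings = [
    [0.2, -0.1, 0.5, 0.3],
    [-0.4, 0.1, 0.0, 0.7],
    [0.6, 0.2, -0.3, -0.1],
]

# Concatenate the DWT coefficients of each drug with its target's embedding.
pairs = []
for fp, emb in zip(fingerprints, protein_embeddings):
    approx, detail = haar_dwt(fp)
    pairs.append(approx + detail + emb)

random.seed(0)
in_features = len(pairs[0])   # 4 approx + 4 detail + 4 embedding values
out_features = 2              # interacting vs. non-interacting (assumed)
weights = [
    [random.uniform(-0.5, 0.5) for _ in range(in_features)]
    for _ in range(out_features)
]
bias = [0.0] * out_features

logits = brl_block(pairs, weights, bias)
print(len(logits), len(logits[0]))  # one row of two scores per drug-target pair
```

In practice, a wavelet library and a deep-learning framework would replace these hand-written functions. The sketch only makes the data flow concrete: fingerprint to wavelet coefficients, concatenation with the protein features, then a BRL block producing per-pair scores.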