Wang Yang, Zhang Zuxian, Piao Chenghong, Huang Ying, Zhang Yihan, Zhang Chi, Lu Yu-Jing, Liu Dongning
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China.
School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China.
Health Inf Sci Syst. 2023 Sep 2;11(1):42. doi: 10.1007/s13755-023-00243-w. eCollection 2023 Dec.
Studying drug-target interactions (DTIs) is a vital drug design strategy, and these interactions play a significant role in many complex disease processes and cellular events. Given challenges such as the large volume of protein data and the high cost of experiments, bioinformatics approaches can be applied to uncover potential interactions and to design new targeted medications. Differences in data and interaction types complicate such studies, which must handle incompatible and heterogeneous formats. Analyzing drug-target interactions in a comprehensive, unified model therefore remains a significant challenge.
Here, we propose a general method for predicting interactions between small-molecule drugs and protein targets: the Large-scale Drug target Screening Convolutional Neural Network (LDS-CNN). It uses a unified encoding so that data in different formats can be processed in a single integrated model, which performs feature abstraction and predicts potential interaction partners.
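The idea of a unified encoding can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the LDS-CNN paper: the character-level vocabularies, the function names `build_vocab`, `encode` and `encode_pair`, the default lengths, and the choice to map unknown symbols to the padding id are all hypothetical. It only shows how a drug string (here, SMILES) and a protein sequence could be mapped into one fixed-length integer vector that a convolutional network could consume.

```python
from typing import Dict, List

PAD = 0  # id reserved for padding; unknown symbols also map here (an assumption)


def build_vocab(symbols: str, offset: int) -> Dict[str, int]:
    """Map each distinct symbol to an integer id, starting at `offset`."""
    return {ch: offset + i for i, ch in enumerate(sorted(set(symbols)))}


# Hypothetical character-level vocabularies. Drug and protein ids do not
# overlap, so "C" (carbon) and "C" (cysteine) receive different ids.
DRUG_VOCAB = build_vocab("BCFHINOPSclnors()[]=#@+-/\\%0123456789", offset=1)
PROTEIN_VOCAB = build_vocab("ACDEFGHIKLMNPQRSTVWY", offset=len(DRUG_VOCAB) + 1)


def encode(text: str, vocab: Dict[str, int], length: int) -> List[int]:
    """Truncate or right-pad `text` to `length` integer ids."""
    ids = [vocab.get(ch, PAD) for ch in text[:length]]
    return ids + [PAD] * (length - len(ids))


def encode_pair(smiles: str, protein: str,
                drug_len: int = 100, protein_len: int = 1000) -> List[int]:
    """Concatenate both encodings into one fixed-length input vector."""
    return (encode(smiles, DRUG_VOCAB, drug_len)
            + encode(protein, PROTEIN_VOCAB, protein_len))
```

Calling `encode_pair("CCO", "MKT", drug_len=5, protein_len=4)` returns a list of nine ids: three drug ids and two padding zeros, followed by three protein ids and one padding zero. Keeping the two id ranges disjoint is one simple way to let a single embedding layer serve both input types.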
On a dataset of 898,412 interactions involving 1683 small-molecule compounds and 14,350 human proteins, drawn from 8.8 billion records, the proposed method achieved an area under the curve (AUC) of 0.96, an area under the precision-recall curve (AUPRC) of 0.95, and an accuracy of 90.13%. These results show that the method is accurate on the test set, indicating strong predictive ability for drug-target interactions. LDS-CNN is effective on large-scale datasets and on datasets that combine data in different formats.
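For readers unfamiliar with the three reported metrics, the following is a minimal pure-Python sketch of how AUC, AUPRC and accuracy can be computed from binary labels and predicted scores. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the use of average precision as the AUPRC estimate are assumptions, and the quadratic-time AUC loop is only suitable for small examples.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUPRC estimated as average precision; assumes scores are distinct."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)
```

As a toy check, with labels `[1, 0, 1, 1, 0, 0]` and scores `[0.9, 0.8, 0.7, 0.6, 0.3, 0.1]`, the AUC is 7/9, the average precision is 29/36, and the accuracy at the 0.5 threshold is 5/6.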
In this study, we propose a DTI prediction method that addresses the unified encoding of large-scale data in multiple formats. It offers a feasible way to efficiently abstract features across different types of drug-related data, reducing experimental cost and time. The method can be used to identify potential drug targets and drug candidates for the treatment of complex diseases. This work serves as a reference for applying deep learning to large-scale, multi-format DTI data and offers suggestions for future research.