Liu Sizhe, Liu Yuchen, Xu Haofeng, Xia Jun, Li Stan Z
Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA 90089, United States.
Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, United States.
Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf011.
Drug-target interaction (DTI) prediction is crucial for drug discovery, significantly reducing costs and time in experimental searches across vast drug compound spaces. While deep learning has advanced DTI prediction accuracy, challenges remain: (i) existing methods often lack generalizability, with performance dropping significantly on unseen proteins and cross-domain settings; and (ii) current molecular relational learning often overlooks subpocket-level interactions, which are vital for a detailed understanding of binding sites.
We introduce SP-DTI, a subpocket-informed transformer model designed to address these challenges through: (i) detailed subpocket analysis using the Cavity Identification and Analysis Routine for interaction modeling at both global and local levels, and (ii) integration of pre-trained language models into graph neural networks to encode drugs and proteins, enhancing generalizability to unlabeled data. Benchmark evaluations show that SP-DTI consistently outperforms state-of-the-art models, achieving an area under the receiver operating characteristic curve of 0.873 in unseen protein settings, an 11% improvement over the best baseline.
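The headline metric above, the area under the receiver operating characteristic curve (AUROC), can be read as the probability that a randomly chosen interacting drug-target pair is scored higher than a randomly chosen non-interacting pair. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auroc` and the toy labels and scores are illustrative assumptions and are not taken from the paper or its repository.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth interaction labels (hypothetical data)
    scores: iterable of predicted interaction scores, same length as labels
    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example (made-up values): 3 interacting pairs, 2 non-interacting pairs.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

In this toy case five of the six positive-negative pairs are ordered correctly, so `toy_auc` is 5/6. The quadratic loop is chosen for readability; a rank-based implementation would be preferable on benchmark-sized data.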
The model scripts are available at https://github.com/Steven51516/SP-DTI.