Wang Debby D, Xie Haoran, Yan Hong
Institute of Medical and Information Engineering, School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China.
Department of Computing and Decision Sciences, Lingnan University, Tuen Mun, Hong Kong.
Bioinformatics. 2021 Sep 9;37(17):2570-2579. doi: 10.1093/bioinformatics/btab132.
Reliable predictive models of protein-ligand binding affinity are required in many areas of biomedical research. Accurate prediction based on current descriptors or molecular fingerprints (FPs) remains a challenge. We develop novel interaction FPs (IFPs) to encode protein-ligand interactions and use them to improve the prediction.
Proteo-chemometrics IFPs (PrtCmm IFPs) were formed by combining extended connectivity fingerprints (ECFPs) with the proteo-chemometrics concept. Combining PrtCmm IFPs with machine-learning models led to efficient scoring models, which were validated on the PDBbind v2019 core set and the CSAR-HiQ sets. The PrtCmm IFP Score outperformed several other models in predicting protein-ligand binding affinities. In addition, conventional ECFPs were simplified to generate new IFPs, which provided consistent but faster predictions. The relationship between the base atom properties of ECFPs and the accuracy of predictions was also investigated.
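As a rough illustration of the extended-connectivity idea that ECFPs build on, the following minimal sketch hashes each atom's base properties, iteratively folds in the identifiers of its bonded neighbours, and maps the resulting identifiers onto a fixed-length bit vector. It is not the authors' PrtCmm IFP and not the IFP Score Toolkit API: the function names (`stable_hash`, `ecfp_like_bits`, `tanimoto`), the choice of element and degree as base atom properties, the 1024-bit length, and the toy molecule encoding are all assumptions made for illustration, and the duplicate-substructure removal of the full ECFP algorithm is omitted.

```python
import hashlib


def stable_hash(obj) -> int:
    """Deterministic 32-bit hash of an object's repr (built-in hash() is salted for str)."""
    digest = hashlib.sha1(repr(obj).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def ecfp_like_bits(atoms, bonds, radius=2, n_bits=1024):
    """Toy ECFP-like fingerprint (hypothetical helper, not a library function).

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j, bond_order) tuples over atom indices
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        neighbors[i].append((order, j))
        neighbors[j].append((order, i))

    # Iteration 0: identifiers from assumed base atom properties (element, degree).
    ids = [stable_hash((sym, len(neighbors[i]))) for i, sym in enumerate(atoms)]
    features = set(ids)

    # Each iteration widens the encoded neighbourhood by one bond.
    for it in range(1, radius + 1):
        new_ids = []
        for i in range(len(atoms)):
            # Sorting makes the result independent of atom numbering.
            env = sorted((order, ids[j]) for order, j in neighbors[i])
            new_ids.append(stable_hash((it, ids[i], env)))
        ids = new_ids
        features.update(ids)

    # Fold the identifiers into a fixed-length bit vector.
    bits = [0] * n_bits
    for f in features:
        bits[f % n_bits] = 1
    return bits


def tanimoto(a, b) -> float:
    """Tanimoto similarity between two equal-length bit vectors."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0


# Toy heavy-atom graphs: ethanol (C-C-O) and propane (C-C-C).
ethanol = ecfp_like_bits(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
propane = ecfp_like_bits(["C", "C", "C"], [(0, 1, 1), (1, 2, 1)])
similarity = tanimoto(ethanol, propane)
```

Bit vectors of this kind can serve as input features for a machine-learning regressor trained on measured binding affinities. Changing the tuple hashed at iteration 0 is one way to experiment with different base atom properties.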
PrtCmm IFP has been implemented in the IFP Score Toolkit on GitHub (https://github.com/debbydanwang/IFPscore).
Supplementary data are available at Bioinformatics online.