Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, Jiangsu, China.
School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, Jiangsu, China.
Technol Health Care. 2023;31(S1):487-495. doi: 10.3233/THC-236042.
Protein-ligand binding affinity is of significant importance in structure-based drug design. Recent machine learning techniques provide an efficient and accurate way to predict binding affinity. However, prediction performance depends largely on how the molecules are represented.
Different molecular descriptors are designed to capture different features. The study aims to identify the optimal circular fingerprints for predicting protein-ligand binding affinity with matched neural network architectures.
Extended-connectivity fingerprints (ECFP) and protein-ligand extended connectivity fingerprints (PLEC) both encode circular atomic and bonding connectivity environments; ECFP favors intra-molecular features, and PLEC favors inter-molecular features. Densely connected neural networks are employed to map the circular fingerprints of protein-ligand complexes to binding affinities.
The performance of the neural networks is sensitive to the parameters used for the ECFP and PLEC fingerprints. The best evaluated ECFP and PLEC fingerprints reach an R2_score of 0.52 and 0.49, respectively, compared with 0.45 and 0.38 for improperly set ECFP and PLEC fingerprints. Additionally, compared with predictions from the standalone fingerprints, the conjoint ECFP+PLEC fingerprints slightly improve prediction accuracy, with an R2_score of approximately 0.55.
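The circular encoding behind these fingerprints can be illustrated with a toy example. The sketch below is not the ECFP or PLEC algorithm used in the study; it is a simplified, pure-Python illustration of the general idea (iteratively hashing each atom together with its neighbors, then folding the hashes into a fixed-length bit vector). The function name `circular_fingerprint` and the parameters `radius` and `n_bits` are hypothetical names chosen here; the abstract does not state which parameters were tuned, and a radius and a folded length are only assumed as typical examples. Bond orders, duplicate-substructure removal, and protein atoms are all ignored.

```python
import hashlib


def _stable_hash(value: str) -> int:
    """Deterministic integer hash (the built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")


def circular_fingerprint(atoms, bonds, radius=2, n_bits=1024):
    """Toy circular fingerprint (illustrative only, not real ECFP/PLEC).

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Radius 0: each atom is identified by its element symbol alone.
    ids = [_stable_hash(symbol) for symbol in atoms]
    features = set(ids)

    # Each iteration widens every atom's environment by one bond.
    for _ in range(radius):
        new_ids = []
        for i in range(len(atoms)):
            env = sorted(ids[j] for j in neighbors[i])  # order-independent
            new_ids.append(_stable_hash(f"{ids[i]}|{env}"))
        ids = new_ids
        features.update(ids)

    # Fold the collected environment hashes into a fixed-length bit vector.
    bits = [0] * n_bits
    for feature in features:
        bits[feature % n_bits] = 1
    return bits


# Heavy atoms of ethanol: C-C-O
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]
fp_r0 = circular_fingerprint(ethanol_atoms, ethanol_bonds, radius=0, n_bits=64)
fp_r2 = circular_fingerprint(ethanol_atoms, ethanol_bonds, radius=2, n_bits=64)
print(sum(fp_r0), sum(fp_r2))
```

A larger radius adds wider environments on top of the smaller ones, and a shorter bit vector raises the chance that distinct environments collide on the same bit. Both effects change the information available to a downstream model, which is one plausible reason for the parameter sensitivity reported above.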
Both the intra- and inter-molecular structural features encoded in the circular fingerprints contribute to protein-ligand binding affinity. Optimizing the parameters of ECFP and PLEC can enhance predictive performance. In general, the conjoint fingerprint scheme can be extended to other molecular descriptors for enhanced feature engineering and improved predictive performance.
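As a minimal sketch of the conjoint scheme and the reported metric, the example below assumes that a conjoint fingerprint is formed by concatenating the individual descriptor vectors of a complex, and that R2_score refers to the usual coefficient of determination. Neither detail is stated in the abstract, so both are assumptions, and the helper names `conjoint` and `r2_score` are hypothetical.

```python
def conjoint(*fingerprints):
    """Concatenate descriptor vectors into one feature vector (assumed scheme)."""
    joined = []
    for fingerprint in fingerprints:
        joined.extend(fingerprint)
    return joined


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up bit vectors standing in for the ECFP and PLEC bits of one complex.
ecfp_bits = [1, 0, 1, 0]
plec_bits = [0, 1, 1]
features = conjoint(ecfp_bits, plec_bits)
print(features)  # [1, 0, 1, 0, 0, 1, 1]
```

Because concatenation places no constraint on what each vector encodes, the same helper accepts any number of descriptor types, which is the sense in which the scheme extends beyond ECFP and PLEC. Under this definition, a perfect prediction gives an R² of 1 and always predicting the mean affinity gives 0.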