Caniceiro Ana B, Amorim Ana M B, Rosário-Ferreira Nícia, Moreira Irina S
CNC-UC - Center for Neuroscience and Cell Biology, University of Coimbra, Rua Larga, Ed FMUC, Piso 1, 3004-504, Coimbra, Portugal.
CiBB - Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, Rua Larga, Ed FMUC, Piso 1, 3004-504, Coimbra, Portugal.
J Cheminform. 2025 Jul 11;17(1):102. doi: 10.1186/s13321-025-01050-z.
G protein-coupled receptors (GPCRs) are vital players in cellular signalling and key targets for drug discovery, especially within the GPCR-A17 subfamily, which is linked to various diseases. To address the growing need for effective treatments, the GPCR-A17 Modulator, Agonist, Antagonist Predictor (MAAP) was introduced: an ensemble machine learning model that combines XGBoost, Random Forest, and LightGBM to predict whether a ligand acts as an agonist, antagonist, or modulator in GPCR-A17 interactions. The model was trained on a dataset of over 3,000 ligands and 6,900 protein-ligand interactions covering all three ligand types, sourced from the Guide to Pharmacology, Therapeutic Target Database, and ChEMBL. On the testing and independent ligand validation datasets, respectively, it achieved F1 scores of 0.9179 and 0.7151, AUCs of 0.9766 and 0.8591, and specificities of 0.9703 and 0.8789; these values reflect overall performance across all classes. Using a Ki-filtered subset of 4,274 interactions (where Ki is the inhibition constant, which quantifies ligand-binding affinity) improved the F1 scores to 0.9330 and 0.8267 for the testing and independent ligand datasets, respectively. By guiding experimental validation, GPCR-A17 MAAP can accelerate drug discovery for various therapeutic targets. The code and data are available on GitHub (https://github.com/MoreiraLAB/GPCR-A17-MAAP).
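As a rough illustration of the ideas in the abstract, the sketch below shows how class probabilities from three models could be combined and how F1 and specificity can be summarised across three ligand classes. It is not taken from the GPCR-A17 MAAP repository. The abstract does not state how the three learners are combined or how the per-class metrics are averaged, so soft voting and macro averaging are assumptions here, and the function names (`soft_vote`, `macro_f1_and_specificity`) and the toy labels are hypothetical.

```python
from typing import List, Sequence, Tuple

# Ligand classes named in the abstract.
CLASSES = ("agonist", "antagonist", "modulator")


def soft_vote(prob_sets: Sequence[Sequence[float]]) -> str:
    """Average per-class probabilities from several models and return the top class.

    Assumption: soft voting. The abstract only says the models are combined.
    """
    n_models = len(prob_sets)
    averaged = [
        sum(probs[i] for probs in prob_sets) / n_models
        for i in range(len(CLASSES))
    ]
    return CLASSES[averaged.index(max(averaged))]


def macro_f1_and_specificity(
    y_true: List[str], y_pred: List[str]
) -> Tuple[float, float]:
    """Compute one-vs-rest F1 and specificity per class, then average them.

    Assumption: macro (unweighted) averaging across the three classes.
    """
    f1_scores, specificities = [], []
    for cls in CLASSES:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        tn = len(pairs) - tp - fp - fn
        f1_denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / f1_denominator if f1_denominator else 0.0)
        specificities.append(tn / (tn + fp) if (tn + fp) else 0.0)
    return (
        sum(f1_scores) / len(CLASSES),
        sum(specificities) / len(CLASSES),
    )


if __name__ == "__main__":
    # Toy probabilities standing in for three models' outputs for one ligand.
    print(soft_vote([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.4, 0.1]]))

    # Toy labels only; these are not data from the paper.
    truth = ["agonist", "agonist", "antagonist", "antagonist", "modulator", "modulator"]
    preds = ["agonist", "antagonist", "antagonist", "antagonist", "modulator", "agonist"]
    print(macro_f1_and_specificity(truth, preds))
```

Macro averaging weights each ligand class equally, which matters when one class (for example, modulators) is much rarer than the others; a weighted or micro average would give different numbers on the same predictions.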