Piazza Lisa, Di Stefano Miriana, Poles Clarissa, Bononi Giulia, Poli Giulio, Renzi Gioele, Galati Salvatore, Giordano Antonio, Macchia Marco, Carta Fabrizio, Supuran Claudiu T, Tuccinardi Tiziano
Department of Pharmacy, University of Pisa, 56126 Pisa, Italy.
Telethon Institute of Genetics and Medicine, 80078 Naples, Italy.
Pharmaceuticals (Basel). 2025 Jul 5;18(7):1007. doi: 10.3390/ph18071007.
Human carbonic anhydrases (hCAs) are metalloenzymes involved in essential physiological processes, and their selective inhibition holds therapeutic potential across a wide range of disorders. However, the high degree of structural similarity among isoforms poses a significant challenge for the design of selective inhibitors. In this work, we present a machine learning (ML)-based platform for the isoform-specific prediction and profiling of small molecules targeting hCA I, II, IX, and XII. By combining four molecular representations with four ML algorithms for each of the four isoforms, we built 64 classification models, each extensively optimized and validated. The best-performing models for each isoform were applied in a virtual screening campaign of ~2 million compounds. Following a multi-step refinement process, 12 candidates were identified, purchased, and experimentally tested. Several compounds showed potent inhibitory activity in the nanomolar to submicromolar range, with varying selectivity profiles across the isoforms. To gain mechanistic insights, SHAP-based feature importance analysis and molecular docking supported by molecular dynamics simulations were employed, highlighting the structural determinants of the predicted activity. This study demonstrates the effectiveness of integrating ML, cheminformatics, and experimental validation to accelerate the discovery of selective carbonic anhydrase inhibitors and provides a generalizable framework for activity profiling across enzyme isoforms.
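The model count above can be checked with a short enumeration. This is only an illustrative sketch under one reading of the abstract, namely that each of the four isoforms gets its own 4 x 4 grid of representation and algorithm pairs. The isoform names come from the abstract; the representation and algorithm labels are hypothetical placeholders, since the abstract does not name them.

```python
from itertools import product

# Isoforms named in the abstract.
isoforms = ["hCA I", "hCA II", "hCA IX", "hCA XII"]

# Placeholder labels (assumption): the abstract gives only the counts,
# not the actual representations or algorithms used.
representations = [f"representation_{i}" for i in range(1, 5)]
algorithms = [f"algorithm_{i}" for i in range(1, 5)]

# One classification model per (isoform, representation, algorithm) triple.
model_grid = list(product(isoforms, representations, algorithms))
models_per_isoform = len(representations) * len(algorithms)

print(models_per_isoform)  # 16
print(len(model_grid))     # 64
```

Under this reading, the "best-performing models for each isoform" would be selected from the 16 candidates in that isoform's grid.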