Motono Chie, Yanagisawa Keisuke, Koseki Jun, Imai Kenichiro
Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 135-0064, Japan.
Integrated Research Center for Self-Care Technology (IRC-SCT), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 135-0064, Japan.
Int J Mol Sci. 2025 May 14;26(10):4710. doi: 10.3390/ijms26104710.
Cryptic sites, which are transient binding sites that emerge through protein conformational changes upon ligand binding, are valuable targets for drug discovery, particularly for allosteric modulators. However, identifying these sites remains challenging because they are often discovered serendipitously, when both the ligand-bound (holo) and ligand-free (apo) structures have been experimentally determined. Here, we introduce CrypTothML, a novel framework that integrates mixed-solvent molecular dynamics (MSMD) simulations and machine learning to predict cryptic sites accurately. CrypTothML first identifies hotspots through MSMD simulations using six chemically diverse probes (benzene, dimethyl ether, phenol, methylimidazole, acetonitrile, and ethylene glycol). A machine learning model then ranks these hotspots by their likelihood of being cryptic sites, incorporating both hotspot-derived and protein-specific features. Evaluation on a curated dataset demonstrated that CrypTothML outperforms recent machine learning-based methods, achieving an AUC-ROC of 0.88 and successfully identifying cryptic sites missed by other methods. Additionally, CrypTothML ranked a cryptic site as the top prediction more frequently than existing methods. This approach provides a powerful strategy for accelerating drug discovery and designing allosteric drugs.
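The two evaluation measures mentioned in the abstract, AUC-ROC over scored hotspots and how often a cryptic site is the top-ranked prediction for a protein, can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc_roc` and `top1_hit_rate`, the record layout, and all scores and labels below are hypothetical toy values chosen only to show the arithmetic.

```python
from collections import defaultdict


def auc_roc(scores, labels):
    """AUC-ROC computed as the probability that a randomly chosen positive
    outscores a randomly chosen negative (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def top1_hit_rate(records):
    """Fraction of proteins whose highest-scored hotspot is labeled cryptic.

    records: iterable of (protein_id, score, label) tuples, label in {0, 1}.
    """
    by_protein = defaultdict(list)
    for protein_id, score, label in records:
        by_protein[protein_id].append((score, label))
    hits = sum(
        1
        for hotspots in by_protein.values()
        if max(hotspots, key=lambda h: h[0])[1] == 1
    )
    return hits / len(by_protein)


# Hypothetical toy data: (protein id, model score, 1 = cryptic site hotspot).
records = [
    ("protA", 0.91, 1),
    ("protA", 0.40, 0),
    ("protA", 0.15, 0),
    ("protB", 0.55, 0),
    ("protB", 0.70, 1),
    ("protC", 0.80, 0),
    ("protC", 0.30, 1),
]
scores = [r[1] for r in records]
labels = [r[2] for r in records]

print("AUC-ROC:", auc_roc(scores, labels))
print("Top-1 hit rate:", top1_hit_rate(records))
```

The pairwise formulation is quadratic in the number of hotspots, which is fine for a small illustration; for larger datasets a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.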