Cimermancic Peter, Weinkam Patrick, Rettenmaier T Justin, Bichmann Leon, Keedy Daniel A, Woldeyes Rahel A, Schneidman-Duhovny Dina, Demerdash Omar N, Mitchell Julie C, Wells James A, Fraser James S, Sali Andrej
Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Graduate Group in Biological and Medical Informatics, University of California, San Francisco, San Francisco, CA 94158, USA.
Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA.
J Mol Biol. 2016 Feb 22;428(4):709-719. doi: 10.1016/j.jmb.2016.01.029. Epub 2016 Feb 5.
Many proteins have small-molecule binding pockets that are not easily detectable in the ligand-free structures. These cryptic sites require a conformational change to become apparent; a cryptic site can therefore be defined as a site that forms a pocket in a holo structure, but not in the apo structure. Because many proteins appear to lack druggable pockets, understanding and accurately identifying cryptic sites could expand the set of drug targets. Previously, cryptic sites were identified experimentally by fragment-based ligand discovery and computationally by long molecular dynamics simulations and fragment docking. Here, we begin by constructing a set of structurally defined apo-holo pairs with cryptic sites. Next, we comprehensively characterize the cryptic sites in terms of their sequence, structure, and dynamics attributes. We find that cryptic sites tend to be as conserved in evolution as traditional binding pockets but are less hydrophobic and more flexible. Relying on this characterization, we use machine learning to predict cryptic sites with relatively high accuracy (for our benchmark, the true positive and false positive rates are 73% and 29%, respectively). We then predict cryptic sites in the entire structurally characterized human proteome (11,201 structures, covering 23% of all residues in the proteome). CryptoSite increases the size of the potentially "druggable" human proteome from ~40% to ~78% of disease-associated proteins. Finally, to demonstrate the utility of our approach in practice, we experimentally validate a cryptic site in protein tyrosine phosphatase 1B using a covalent ligand and NMR spectroscopy. The CryptoSite Web server is available at http://salilab.org/cryptosite.
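The benchmark figures quoted above (true positive rate 73%, false positive rate 29%) are the two standard rates of a binary classifier. As a minimal sketch of how such rates are computed, the snippet below assumes per-residue labels and prediction scores with a fixed decision threshold. The abstract does not state the scoring granularity or threshold used, so the function name `tpr_fpr`, the threshold value, and the toy data are all hypothetical and are not the authors' implementation.

```python
from typing import Sequence, Tuple


def tpr_fpr(
    labels: Sequence[bool],
    scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate).

    labels[i] is True if residue i belongs to a known cryptic site;
    scores[i] is a predicted score for that residue (hypothetical scale);
    a residue is predicted cryptic when its score is >= threshold.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    tp = fp = fn = tn = 0
    for is_cryptic, score in zip(labels, scores):
        predicted = score >= threshold
        if predicted and is_cryptic:
            tp += 1
        elif predicted:
            fp += 1
        elif is_cryptic:
            fn += 1
        else:
            tn += 1

    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")

    # TPR = TP / (TP + FN); FPR = FP / (FP + TN)
    return tp / (tp + fn), fp / (fp + tn)


# Toy data, invented for illustration only (not from the paper's benchmark).
toy_labels = [True, True, True, True, False, False, False, False]
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.3, 0.1, 0.05]
toy_tpr, toy_fpr = tpr_fpr(toy_labels, toy_scores, threshold=0.5)
```

Lowering the threshold raises both rates together, so a reported pair such as 73% and 29% corresponds to one operating point on the classifier's ROC curve.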