Herrington Noah B, Stein David, Li Yan Chak, Pandey Gaurav, Schlessinger Avner
bioRxiv. 2023 Sep 2:2023.08.31.555779. doi: 10.1101/2023.08.31.555779.
Protein kinase function and interactions with drugs are controlled in part by the movement of the DFG and αC-helix motifs, which enable kinases to adopt various conformational states. Small-molecule ligands elicit therapeutic effects with distinct selectivity profiles and residence times that often depend on the kinase conformation(s) they bind. However, the limited availability of experimentally determined structures of kinases in inactive states restricts drug discovery efforts for this major protein family. Modern AI-based structural modeling methods hold potential for exploring druggable kinase conformational space that has not yet been charted experimentally. Here, we first evaluated the conformational space of kinases currently covered by the PDB and by models generated with AlphaFold2 (AF2) (1) and ESMFold (2), two prominent AI-based structure prediction methods. We then investigated AF2's ability to predict kinase structures in different conformations at various multiple sequence alignment (MSA) depths, a parameter chosen for its ability to expose conformational diversity. Our results showed a bias in the PDB, and in the structural models predicted by AF2 and ESMFold, toward kinases in the active state over alternative conformations, particularly those controlled by the DFG motif. Finally, we demonstrated that predicting kinase structures with AF2 at lower MSA depths allows exploration of these alternative conformations, including the identification of previously unobserved conformations for 398 kinases. Our analysis of structural modeling by AF2 opens a new avenue for the pursuit of therapeutic agents against a notoriously difficult-to-target family of proteins.
A greater abundance of structural data for kinases in inactive conformations, which structural databases currently lack, would improve our understanding of how protein kinases function and expand drug discovery and development for this family of therapeutic targets. Modern approaches based on artificial intelligence and machine learning have potential for efficiently capturing novel protein conformations. We provide evidence that AlphaFold2 and ESMFold are biased toward predicting kinase structures in their active states, mirroring the overrepresentation of active-state structures in the PDB. We show that lowering the multiple sequence alignment depth used by AlphaFold2 helps explore kinase conformational space more broadly. Lowering the depth also enables the prediction of hundreds of kinase structures in novel conformations, many of which are likely viable for drug discovery.
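As a rough illustration of what "lowering MSA depth" means in practice, the sketch below subsamples an alignment to a fixed number of sequences while always keeping the query. It is a minimal, assumed example and not the authors' pipeline: the function names (`parse_a3m`, `subsample_msa`, `to_a3m`), the uniform random sampling, and the A3M-style text input are all choices made here for illustration. AF2 implementations typically control MSA depth through their own subsampling settings, which this sketch does not reproduce.

```python
import random


def parse_a3m(text):
    """Parse A3M/FASTA-style text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def subsample_msa(records, depth, seed=0):
    """Return at most `depth` sequences: the query (first record) plus a
    uniform random sample of the remaining sequences, in original order.

    Hypothetical helper: uniform sampling is an assumption of this sketch.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if not records:
        return []
    query, rest = records[0], records[1:]
    rng = random.Random(seed)
    k = min(depth - 1, len(rest))
    picked = sorted(rng.sample(range(len(rest)), k))
    return [query] + [rest[i] for i in picked]


def to_a3m(records):
    """Serialize (header, sequence) tuples back to A3M/FASTA-style text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

Running a structure predictor on several such subsampled alignments, each drawn with a different `seed` and `depth`, is one plausible way to probe how predicted conformations vary with alignment depth. The depths that matter for a given kinase would have to be determined empirically.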