Department of Biology, Johns Hopkins University, Baltimore, MD 21218.
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218.
Proc Natl Acad Sci U S A. 2020 Sep 22;117(38):23606-23616. doi: 10.1073/pnas.1921473117. Epub 2020 Sep 8.
Phosphorylation sites are hyperabundant in the eukaryotic disordered proteome, suggesting that conformational fluctuations play a major role in determining to what extent a kinase interacts with a particular substrate. In biophysical terms, substrate selectivity may be determined not just by the structural-chemical complementarity between the kinase and its protein substrates but also by the free energy difference between the conformational ensembles that are, or are not, recognized by the kinase. To test this hypothesis, we developed a statistical-thermodynamics-based informatics framework, which allows us to probe for the contribution of equilibrium fluctuations to phosphorylation, as evaluated by the ability to predict Ser/Thr/Tyr phosphorylation sites in the disordered proteome. Essential to this framework is a decomposition of substrate sequence information into two types: vertical information encoding conserved kinase specificity motifs and horizontal information encoding substrate conformational equilibrium that is embedded, but often not apparent, within position-specific conservation patterns. We find not only that conformational fluctuations play a major role but also that they are the dominant contribution to substrate selectivity. In fact, the main substrate classifier distinguishing selectivity is the magnitude of change in local compaction of the disordered chain upon phosphorylation of these mostly singly phosphorylated sites. In addition to providing fundamental insights into the consequences of phosphorylation across the proteome, our approach provides a statistical-thermodynamic strategy for partitioning any sequence-based search into contributions from structural-chemical complementarity and those from changes in conformational equilibrium.
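The free-energy argument above can be illustrated with a minimal two-state sketch. This is not the authors' formalism; it is a textbook simplification under stated assumptions: the substrate's conformations are lumped into a kinase-recognized ensemble ("rec") and a non-recognized ensemble ("non"), and the symbols $K$, $\Delta G$, and $f_{\mathrm{rec}}$ are introduced here purely for illustration.

```latex
% Assumed two-state model: recognized (rec) vs. non-recognized (non) ensembles.
% R is the gas constant and T the absolute temperature.
\Delta G = G_{\mathrm{rec}} - G_{\mathrm{non}}, \qquad
K = \frac{[\mathrm{rec}]}{[\mathrm{non}]} = e^{-\Delta G / RT}

% Fraction of substrate molecules in the recognized ensemble at equilibrium.
f_{\mathrm{rec}} = \frac{K}{1 + K} = \frac{1}{1 + e^{\Delta G / RT}}
```

Under this sketch, $\Delta G = 0$ gives $f_{\mathrm{rec}} = 0.5$, whereas $\Delta G = +RT \ln 10$ (about 1.4 kcal/mol at 298 K) gives $K = 0.1$ and $f_{\mathrm{rec}} \approx 0.09$. A modest shift in the conformational equilibrium can therefore change the population available to the kinase several-fold, even when the local sequence motif is unchanged.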