Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, PA 19104-6315.
Graduate Group in Biochemistry and Molecular Biology, University of Pennsylvania, Philadelphia, PA 19104-6073.
Proc Natl Acad Sci U S A. 2021 Mar 9;118(10). doi: 10.1073/pnas.2019132118.
Kinases play important roles in diverse cellular processes, including signaling, differentiation, proliferation, and metabolism. They are frequently mutated in cancer and are the targets of a large number of specific inhibitors. Surveys of cancer genome atlases reveal that kinase domains, which consist of roughly 300 amino acids, can harbor numerous (150 to 200) single-point mutations across different patients with the same disease. This preponderance of mutations (some activating, some silent) in a known target protein makes clinical decisions about enrolling patients in drug trials challenging, since the relevance of the target and its drug sensitivity often depend on the mutational status of a given patient. We show through computational studies using molecular dynamics (MD) and enhanced sampling simulations that the experimentally determined activation status of a mutated kinase can be predicted effectively by identifying a hydrogen bonding fingerprint in the activation loop and αC-helix regions, even though mutations in cancer patients occur throughout the kinase domain. In our study, the predictive power of MD is superior to that of a purely data-driven machine learning model built on biochemical features, which we also implemented, even though MD used far fewer features (in fact, just one) in an unsupervised setting. Moreover, the MD results provide key insights into convergent mechanisms of activation, which primarily involve differential stabilization of a hydrogen bond network that engages residues of the activation loop and αC-helix in the active-like conformation (in >70% of the mutations studied, regardless of the location of the mutation).
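The idea of a single hydrogen-bond feature can be illustrated with a minimal sketch. This is not the authors' protocol: the function names, the geometric cutoffs (donor-acceptor distance of 3.5 Å, donor-hydrogen-acceptor angle of 150°), and the zero threshold are all assumptions chosen for illustration, and the input arrays stand in for per-frame geometries that would be extracted from MD trajectories of the active-like conformation.

```python
import numpy as np

def hbond_occupancy(dist, angle, d_cut=3.5, a_cut=150.0):
    """Fraction of frames in which each hydrogen bond is formed.

    dist, angle: arrays of shape (n_frames, n_bonds) holding the
    donor-acceptor distance (angstrom) and donor-H-acceptor angle (degrees).
    The cutoffs are assumed, commonly used geometric criteria.
    """
    dist = np.asarray(dist, dtype=float)
    angle = np.asarray(angle, dtype=float)
    formed = (dist <= d_cut) & (angle >= a_cut)
    return formed.mean(axis=0)

def activation_score(mut_occ, wt_occ):
    """Mean change in occupancy (mutant minus wild type) over the network."""
    return float(np.mean(np.asarray(mut_occ) - np.asarray(wt_occ)))

def predict_activating(mut_occ, wt_occ, threshold=0.0):
    """Hypothetical rule: call a mutation activating if it stabilizes
    the hydrogen bond network relative to wild type."""
    return activation_score(mut_occ, wt_occ) > threshold

# Toy data: 4 frames, 2 hydrogen bonds (illustrative values only).
angles = np.full((4, 2), 160.0)
wt_dist = np.array([[3.0, 4.0], [3.0, 4.0], [4.0, 4.0], [4.0, 3.0]])
mut_dist = np.full((4, 2), 3.0)

wt_occ = hbond_occupancy(wt_dist, angles)
mut_occ = hbond_occupancy(mut_dist, angles)
print(activation_score(mut_occ, wt_occ), predict_activating(mut_occ, wt_occ))
```

Because the rule needs no labeled training data, it is unsupervised in the same sense as the single-feature approach described above; in practice the bond list and any threshold would have to come from the structural analysis itself.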