Raisinghani Nishank, Parikh Vedant, Foley Brandon, Verkhivker Gennady
bioRxiv. 2024 Nov 6:2024.11.04.621947. doi: 10.1101/2024.11.04.621947.
Proteins often exist in multiple conformational states, influenced by the binding of ligands or substrates. The study of these states, particularly the apo (unbound) and holo (ligand-bound) forms, is crucial for understanding protein function, dynamics, and interactions. In the current study, we use an adaptation of AlphaFold2 that combines randomized alanine sequence masking with shallow multiple sequence alignment (MSA) subsampling to expand the conformational diversity of the predicted structural ensembles and capture conformational changes between apo and holo protein forms. Using several well-established datasets of structurally diverse apo-holo protein pairs, we show that the proposed approach enables robust predictions of apo and holo structures and conformational ensembles, while the resulting ensembles display notably similar dynamics distributions. These observations are consistent with the view that the intrinsic dynamics of allosteric proteins are defined by the structural topology of the fold and favor conserved conformational motions driven by soft modes. Our findings support the notion that AlphaFold2 approaches can yield reasonable accuracy in predicting minor conformational adjustments between apo and holo states, especially for proteins with moderate, localized changes upon ligand binding. However, for large, hinge-like domain movements, AlphaFold2 tends to predict the most stable domain orientation, which is typically the apo form, rather than the full range of functional conformations characteristic of the holo ensemble. These results indicate that robust modeling of functional protein states may require more accurate characterization of flexible regions in functional conformations and detection of high-energy conformations. Incorporating a wider variety of protein structures into training datasets, including both apo and holo forms, could allow the model to learn to recognize and predict the structural changes that occur upon ligand binding.
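The input-perturbation idea described in the abstract (randomized alanine masking combined with shallow MSA subsampling) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `alanine_mask`, `subsample_msa`, and `make_inputs`, the masking fraction, the MSA depth, and the choice to mask every retained row independently are all assumptions made for illustration, and the AlphaFold2 inference step itself is omitted.

```python
import random


def alanine_mask(sequence, fraction, rng):
    """Replace a random fraction of non-gap residues with alanine ('A')."""
    candidates = [i for i, aa in enumerate(sequence) if aa != "-"]
    n_mask = int(round(fraction * len(candidates)))
    residues = list(sequence)
    for i in rng.sample(candidates, n_mask):
        residues[i] = "A"
    return "".join(residues)


def subsample_msa(msa, depth, rng):
    """Keep the query (first row) plus a random subset of the other rows."""
    query, others = msa[0], msa[1:]
    k = max(0, min(depth - 1, len(others)))
    return [query] + rng.sample(others, k)


def make_inputs(msa, n_samples, fraction=0.15, depth=4, seed=0):
    """Build n_samples perturbed shallow MSAs (hypothetical parameters)."""
    rng = random.Random(seed)
    inputs = []
    for _ in range(n_samples):
        shallow = subsample_msa(msa, depth, rng)
        # Assumption: each retained row is masked independently.
        inputs.append([alanine_mask(row, fraction, rng) for row in shallow])
    return inputs
```

Each perturbed alignment returned by `make_inputs` would then be passed to a structure predictor as a separate run, so that the collection of predictions forms a conformational ensemble; different seeds, fractions, or depths give different ensembles.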