Department of Chemistry, Michigan State University, East Lansing, Michigan 48824-1322, USA.
J Chem Phys. 2018 Mar 14;148(10):105102. doi: 10.1063/1.5010428.
Intrinsically disordered proteins (IDPs) sample a diverse conformational space. They are important to signaling and regulatory pathways in cells. An entropy penalty must be paid when an IDP becomes ordered upon interaction with another protein or a ligand. Thus, the degree of conformational disorder of an IDP is of interest. We create a dichotomic Markov model that can explore entropic features of an IDP. The Markov condition introduces local (neighbor residues in a protein sequence) rotamer dependences that arise from van der Waals and other chemical constraints. A protein sequence of length N is characterized by its (information) entropy and its mutual information, MI_MC, the latter providing a measure of the dependence among the random variables describing the rotamer probabilities of the residues that comprise the sequence. For a Markov chain, MI_MC is proportional to the pair mutual information MI, which depends on the singlet and pair probabilities of neighbor residue rotamer sampling. All 2^N sequence states are generated, along with their probabilities, and contrasted with the probabilities under the assumption of independent residues. An efficient method to generate realizations of the chain is also provided. The chain entropy, MI_MC, and state probabilities provide the ingredients to distinguish different scenarios using the terminologies MoRF (molecular recognition feature), not-MoRF, and not-IDP. A MoRF corresponds to large entropy and large MI_MC (strong dependence among the residues' rotamer sampling), a not-MoRF corresponds to large entropy but small MI_MC, and a not-IDP corresponds to low entropy irrespective of MI_MC. We show that MoRFs are the most appropriate descriptors of IDPs. They provide a reasonable number of high-population states that reflect the dependences between neighbor residues, thus classifying them as IDPs, yet their entropy is not so large that it would lead to an excessive entropy penalty.
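The quantities named in the abstract can be illustrated with a minimal sketch of a two-state (dichotomic) Markov chain. This is not the authors' implementation. It assumes a stationary, homogeneous chain in which each residue takes one of two rotamer states labelled 0 and 1, and the transition probabilities `p01` and `p10`, like all function names, are hypothetical. Under those assumptions the chain mutual information equals (N - 1) times the pair mutual information, which is one way the stated proportionality can arise.

```python
import itertools
import math
import random

def stationary(p01, p10):
    """Stationary singlet probabilities (pi0, pi1) of the two-state chain."""
    total = p01 + p10
    return (p10 / total, p01 / total)

def transition(p01, p10):
    """Transition matrix T[a][b] = P(next residue in b | current residue in a)."""
    return ((1.0 - p01, p01), (p10, 1.0 - p10))

def chain_states(n, p01, p10):
    """Enumerate all 2**n sequence states with their Markov probabilities."""
    pi, T = stationary(p01, p10), transition(p01, p10)
    probs = {}
    for state in itertools.product((0, 1), repeat=n):
        p = pi[state[0]]
        for a, b in zip(state, state[1:]):
            p *= T[a][b]
        probs[state] = p
    return probs

def entropy(ps):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in ps if p > 0.0)

def pair_mi(p01, p10):
    """Mutual information (bits) between two neighboring residues."""
    pi, T = stationary(p01, p10), transition(p01, p10)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            joint = pi[a] * T[a][b]
            if joint > 0.0:
                mi += joint * math.log2(joint / (pi[a] * pi[b]))
    return mi

def chain_mi(n, p01, p10):
    """Independent-residue entropy minus the chain entropy (bits)."""
    independent = n * entropy(stationary(p01, p10))
    return independent - entropy(chain_states(n, p01, p10).values())

def sample_chain(n, p01, p10, rng):
    """Draw one realization residue by residue (plain sequential sampling)."""
    pi, T = stationary(p01, p10), transition(p01, p10)
    state = [0 if rng.random() < pi[0] else 1]
    for _ in range(n - 1):
        state.append(0 if rng.random() < T[state[-1]][0] else 1)
    return state
```

For example, `chain_mi(5, 0.2, 0.3)` can be compared with `4 * pair_mi(0.2, 0.3)`, and the dictionary returned by `chain_states` can be compared state by state with the product of singlet probabilities to see how neighbor dependence redistributes population. Choosing `p01 + p10 = 1` makes the residues independent, so the mutual information vanishes.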