Swiss Institute of Bioinformatics, Lausanne, Switzerland.
PLoS One. 2011;6(5):e20488. doi: 10.1371/journal.pone.0020488. Epub 2011 May 27.
Intrinsically disordered proteins (IDPs) and proteins with intrinsically disordered regions (IDRs) lack a well-defined tertiary structure but perform a multitude of functions, often relying on their native disorder to achieve binding flexibility by switching between alternative conformations. Intrinsic disorder is found frequently in all three kingdoms of life, and may occur in short stretches or span whole proteins. To date, most studies contrasting ordered and disordered proteins have focused on simple summary statistics. Here, we propose an evolutionary approach to studying IDPs, and contrast the patterns specific to ordered protein regions with those of the corresponding IDRs.
Two empirical Markov models of amino acid substitution were estimated from a large set of multiple sequence alignments with experimentally verified annotations of disordered regions, taken from the DisProt database of IDPs. We applied new methods to detect differences in Markovian evolution and in evolutionary rates between IDRs and the corresponding ordered protein regions. Further, we investigated the distribution of IDPs among functional categories and biochemical pathways, and their propensity to contain tandem repeats.
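As a rough illustration of what estimating separate substitution models for ordered and disordered regions involves, the sketch below counts residue pairs per alignment column and normalises them into conditional substitution frequencies. It is a minimal sketch only: the pair-counting scheme, the function names, the toy alignment, and the per-column disorder mask are all assumptions made for illustration. It ignores phylogeny and branch lengths, and it is not the estimation procedure used in the study.

```python
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def count_substitutions(alignment, disorder_mask):
    """Count residue pairs per column, split by the column's disorder label."""
    counts = {True: Counter(), False: Counter()}
    for col, is_disordered in enumerate(disorder_mask):
        column = [seq[col] for seq in alignment]
        for a, b in combinations(column, 2):
            # Skip gaps and ambiguous symbols.
            if a in AMINO_ACIDS and b in AMINO_ACIDS:
                # Count both directions so the pair counts stay symmetric.
                counts[is_disordered][(a, b)] += 1
                counts[is_disordered][(b, a)] += 1
    return counts

def row_normalise(pair_counts):
    """Convert symmetric pair counts into row-stochastic frequencies."""
    matrix = {}
    for a in AMINO_ACIDS:
        total = sum(pair_counts[(a, b)] for b in AMINO_ACIDS)
        if total:  # residues never observed get no row
            matrix[a] = {b: pair_counts[(a, b)] / total for b in AMINO_ACIDS}
    return matrix

# Toy data (hypothetical): three aligned sequences and a per-column mask
# marking the last three columns as disordered.
alignment = ["MKSPE", "MRSPE", "MKSAE"]
disorder_mask = [False, False, True, True, True]

counts = count_substitutions(alignment, disorder_mask)
ordered = row_normalise(counts[False])
disordered = row_normalise(counts[True])
```

Comparing `ordered` and `disordered` row by row gives a crude view of how exchange patterns differ between the two region types. A model-based analysis would instead fit rate matrices on phylogenetic trees.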
We find significant differences in evolution between the ordered and disordered regions of proteins. Most importantly, we find that disorder-promoting amino acids are more conserved in IDRs, indicating that in some cases not only the amino acid composition but also the specific sequence is important for function. This conjecture is reinforced by the observation that, for part of our data set, IDRs evolve more slowly than the ordered parts of the proteins, although our results still support the common view that IDRs in general evolve more quickly. The improvement in model fit suggests that various types of analyses could benefit from these models. For example, de novo disorder prediction using a phylogenetic hidden Markov model based on our matrices performed similarly to other disorder predictors.