Mohan Sidharth, Ozer Hatice Gulcin, Ray William C
Interdisciplinary Graduate Program in Biophysics, The Ohio State University, Columbus, OH, United States.
Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, United States.
Front Bioinform. 2022 Apr 5;2:836526. doi: 10.3389/fbinf.2022.836526. eCollection 2022.
Small changes in a protein's core packing produce changes in function, and even small changes in function bias species fitness and survival. Therefore, individually deleterious mutations should be evolutionarily coupled with compensating mutations that recover fitness, and co-evolving pairs of mutations should be littered across evolutionary history. Despite this longstanding intuition, the results of co-evolution analyses have largely disappointed expectations. Regardless of the statistics applied, only a small majority of the most strongly co-evolving residues are typically found to be in contact, and much of the "meaning" of observed co-evolution has been opaque. In a medium-sized protein of 300 amino acids, there are almost 20 million potentially important interdependencies. It is impossible to understand these data in textual format without extreme summarization or truncation, and, because of that summarization and truncation, it is impossible to identify most patterns in the data. We developed a visualization approach that eschews the common "look at a long list of statistics" approach and instead enables the user to literally look at all of the co-evolution statistics simultaneously. Users of our tool reported visually obvious "clouds" of co-evolution statistics forming distinct patterns in the data, and analysis demonstrated that these clouds had structural relevance. To determine whether this phenomenon generalized, we repeated this experiment in three proteins we had not previously studied. The results provide evidence about how structural constraints have impacted co-evolution and why previous "examine the most frequently co-evolving residues" approaches have had limited success, and they additionally shed light on the biophysical importance of different types of co-evolution.
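The abstract's figure of "almost 20 million" interdependencies for a 300-residue protein can be checked with a short calculation. The sketch below assumes, and this is an assumption rather than something the abstract states, that one interdependency is counted for every unordered pair of positions combined with every ordered pair of residue identities drawn from the 20 standard amino acids, with gap characters ignored. The function name `count_interdependencies` is hypothetical.

```python
from math import comb

# Assumption: the 20 standard amino acids, with alignment gaps ignored.
AMINO_ACIDS = 20


def count_interdependencies(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Count (position pair, residue-identity pair) combinations.

    Assumed counting scheme: every unordered pair of sequence positions,
    multiplied by every ordered pair of residue identities that could
    occupy those two positions.
    """
    position_pairs = comb(length, 2)      # unordered pairs of positions
    identity_pairs = alphabet * alphabet  # residue at i x residue at j
    return position_pairs * identity_pairs


if __name__ == "__main__":
    # 44,850 position pairs x 400 identity pairs
    print(f"{count_interdependencies(300):,}")  # prints 17,940,000
```

Under this counting scheme the total is about 17.9 million, which is consistent with "almost 20 million"; a different scheme (for example, one that includes a gap symbol) would give a different total.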