Schuster P, Stadler P F
Institut für Theoretische Chemie, Universität Wien, Austria.
Comput Chem. 1994 Sep;18(3):295-324. doi: 10.1016/0097-8485(94)85025-9.
The evolution of RNA molecules in replication assays, viroids and RNA viruses can be viewed as an adaptation process on a 'fitness' landscape. The dynamics of evolution is hence tightly linked to the structure of the underlying landscape. Global features of landscapes can be described by statistical measures such as the number of optima, the lengths of walks and correlation functions. The evolution of a quasispecies on such landscapes exhibits three dynamical regimes depending on the replication fidelity: above the "localization threshold" the population is centered around a (local) optimum. Between the localization threshold and the "dispersion threshold" the population is still centered around a consensus sequence, which, however, changes in time. For very large mutation rates the population spreads in sequence space like a gas. The critical mutation rates separating the three domains depend strongly on characteristic properties of the fitness landscapes. Statistical characteristics of RNA landscapes are accessible by mathematical analysis and computer calculations at the level of secondary structures: these RNA landscapes belong to the same class as well-known optimization problems and simple spin glass models. The notion of a landscape is extended to combinatory maps, thereby allowing for a direct statistical investigation of the sequence-structure relationships of RNA at the level of secondary structures. Frequencies of structures are highly non-uniform: we find relatively few common and many rare ones, as expressed by a generalized form of Zipf's law. Using an algorithm for inverse folding we show that sequences sharing the same structure are distributed randomly over sequence space. Together with calculations of structure correlations and a survey of neutral mutations this provides convincing evidence that RNA landscapes are as simple as they could possibly be for evolutionary adaptation: any desired secondary structure can be found close to an arbitrary initial sequence, and at the same time almost all bases can be substituted sequentially without ever changing the shape of the molecule. Consequences of these results for evolutionary optimization, the early stages of life, and molecular biotechnology are discussed.
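The statement that structure frequencies are highly non-uniform can be illustrated with a small rank-frequency tally. The following Python sketch is not the method of the paper: it assumes a toy folding rule (Nussinov-style base-pair maximization with a minimum hairpin loop of three bases) as a stand-in for a real secondary-structure predictor, and the names `toy_fold`, `rank_frequencies`, `PAIRS` and `MIN_LOOP` are hypothetical.

```python
import random
from collections import Counter

# Assumption: allowed pairs are Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def toy_fold(seq):
    """Return one maximum-pairing structure in dot-bracket notation.

    Toy stand-in for a folding algorithm: it maximizes the number of
    base pairs and ignores energies entirely.
    """
    n = len(seq)
    # best[i][j] = maximum number of pairs in seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - MIN_LOOP):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Trace back one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


def rank_frequencies(n_samples=2000, length=12, seed=0):
    """Fold random sequences and return (structure, count) pairs by rank."""
    rng = random.Random(seed)
    counts = Counter(
        toy_fold("".join(rng.choice("ACGU") for _ in range(length)))
        for _ in range(n_samples)
    )
    return counts.most_common()  # most frequent structure first
```

Calling `rank_frequencies()` and plotting count against rank on log-log axes gives a quick visual check of whether a few structures dominate the sample. Any quantitative comparison with a generalized Zipf form would require a realistic folding model, which this sketch does not provide.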