Reidys C, Stadler P F, Schuster P
Santa Fe Institute, NM 87501, USA.
Bull Math Biol. 1997 Mar;59(2):339-97. doi: 10.1007/BF02462007.
Random graph theory is used to model and analyse the relationships between sequences and secondary structures of RNA molecules, which are understood as mappings from sequence space into shape space. These maps are non-invertible since there are always many orders of magnitude more sequences than structures. Sequences folding into identical structures form neutral networks. A neutral network is embedded in the set of sequences that are compatible with the given structure. Networks are modeled as graphs and constructed by random choice of vertices from the space of compatible sequences. The theory characterizes neutral networks by the mean fraction of neutral neighbors (λ). The networks are connected and percolate sequence space if the fraction of neutral nearest neighbors exceeds a threshold value (λ > λ*). Below the threshold (λ < λ*), the networks are partitioned into a largest "giant" component and several smaller components. Structures are classified as "common" or "rare" according to the sizes of their pre-images, i.e. according to the fractions of sequences folding into them. The neutral networks of any two different common structures almost touch each other and, as expressed by the conjecture of shape space covering, sequences folding into almost all common structures can be found in a small ball around an arbitrary location in sequence space. The results from random graph theory are compared to data obtained by folding large samples of RNA sequences. Differences are explained in terms of specific features of RNA molecular structures.
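The random-graph construction described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a toy sequence space of all length-n strings over the alphabet ACGU, ignores base pairing (so every sequence counts as compatible), and selects each sequence independently with probability lam, which stands in for λ. The function names random_induced_subgraph, neighbors, and component_sizes are hypothetical and chosen only for this example.

```python
import itertools
import random
from collections import deque


def random_induced_subgraph(n, alphabet, lam, rng):
    """Select each length-n sequence independently with probability lam.

    Assumption: all sequences are treated as compatible; base pairing is ignored.
    """
    return {
        seq
        for seq in itertools.product(alphabet, repeat=n)
        if rng.random() < lam
    }


def neighbors(seq, alphabet):
    """Yield every sequence at Hamming distance 1 from seq (one point mutation)."""
    for i, base in enumerate(seq):
        for other in alphabet:
            if other != base:
                yield seq[:i] + (other,) + seq[i + 1:]


def component_sizes(vertices, alphabet):
    """Return the connected-component sizes of the induced subgraph, largest first."""
    unseen = set(vertices)
    sizes = []
    while unseen:
        start = unseen.pop()
        queue = deque([start])
        size = 1
        # Breadth-first search restricted to the selected vertices.
        while queue:
            current = queue.popleft()
            for nb in neighbors(current, alphabet):
                if nb in unseen:
                    unseen.remove(nb)
                    queue.append(nb)
                    size += 1
        sizes.append(size)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    rng = random.Random(1997)  # fixed seed so repeated runs give the same graph
    for lam in (0.2, 0.4, 0.8):
        network = random_induced_subgraph(5, "ACGU", lam, rng)
        sizes = component_sizes(network, "ACGU")
        print(f"lam={lam}: {len(network)} vertices, {len(sizes)} components, "
              f"largest={sizes[0] if sizes else 0}")
```

Running the script for a few values of lam shows how the number of components and the size of the largest one change with the selection probability. A sequence length of 5 is far too short to reproduce the asymptotic threshold behaviour, so the output should be read as an illustration of the construction rather than as a test of the theory.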