Asymptotics of canonical and saturated RNA secondary structures.

Author information

Clote Peter, Kranakis Evangelos, Krizanc Danny, Salvy Bruno

Affiliation

Department of Biology, Boston College, Chestnut Hill, MA 02467, USA.

Publication information

J Bioinform Comput Biol. 2009 Oct;7(5):869-93. doi: 10.1142/s0219720009004333.

Abstract

It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is 1.104366 · n^(-3/2) · 2.618034^n. In this paper, we study combinatorial asymptotics for two special subclasses of RNA secondary structures - canonical and saturated structures. Canonical secondary structures are defined to have no lonely (isolated) base pairs. This class of secondary structures was introduced by Bompfünewerer et al., who noted that the run time of Vienna RNA Package is substantially reduced when restricting computations to canonical structures. Here we provide an explanation for the speed-up, by proving that the asymptotic number of canonical RNA secondary structures is 2.1614 · n^(-3/2) · 1.96798^n and that the expected number of base pairs in a canonical secondary structure is 0.31724 · n. The asymptotic number of canonical secondary structures was obtained much earlier by Hofacker, Schuster and Stadler using a different method. Saturated secondary structures have the property that no base pairs can be added without violating the definition of secondary structure (i.e. introducing a pseudoknot or base triple). Here we show that the asymptotic number of saturated structures is 1.07427 · n^(-3/2) · 2.35467^n, the asymptotic expected number of base pairs is 0.337361 · n, and the asymptotic number of saturated stem-loop structures is 0.323954 · 1.69562^n, in contrast to the number 2^(n-2) of (arbitrary) stem-loop structures as classically computed by Stein and Waterman. Finally, we apply the work of Drmota to show that the density of states for [all resp. canonical resp. saturated] secondary structures is asymptotically Gaussian. We introduce a stochastic greedy method to sample random saturated structures, called quasi-random saturated structures, and show that the expected number of base pairs is 0.340633 · n.
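
As a rough numerical illustration of the Stein and Waterman estimate quoted in the abstract, the sketch below counts secondary structures exactly with the standard recurrence and compares the counts with 1.104366 · n^(-3/2) · 2.618034^n. It assumes the homopolymer model (any two positions may pair) and hairpin loops of at least one unpaired base, which is the setting in which that estimate is usually stated. The function names are illustrative and are not taken from the paper.

```python
def count_structures(n_max):
    """Return [S(0), ..., S(n_max)], where S(n) is the number of secondary
    structures on n nucleotides (assumed: homopolymer model, hairpin loops
    contain at least one unpaired base)."""
    s = [1] * (n_max + 1)  # S(0) = S(1) = S(2) = 1: only the empty structure
    for n in range(3, n_max + 1):
        # Position n is either unpaired, giving S(n-1), or paired with some k
        # (1 <= k <= n-2). The pair splits the sequence into a left part of
        # length k-1 and an enclosed part of length n-1-k.
        s[n] = s[n - 1] + sum(s[k - 1] * s[n - 1 - k] for k in range(1, n - 1))
    return s


def stein_waterman_estimate(n):
    """Asymptotic estimate with the constants quoted in the abstract."""
    return 1.104366 * n ** -1.5 * 2.618034 ** n


counts = count_structures(500)
for n in (10, 50, 100, 500):
    # Ratio of the exact count to the asymptotic estimate.
    print(n, counts[n] / stein_waterman_estimate(n))
```

The printed ratio should move toward 1 as n grows, since the estimate is only asymptotic and small n carries a visible lower-order correction. The same comparison could in principle be repeated for the canonical and saturated classes, but their recurrences are not given in the abstract and are not sketched here.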
