Topological language for RNA.

Author information

Huang Fenix W D, Reidys Christian M

Affiliation

Biocomplexity Institute of Virginia Tech, Virginia Tech, United States.

Publication information

Math Biosci. 2016 Dec;282:109-120. doi: 10.1016/j.mbs.2016.10.006. Epub 2016 Oct 20.

Abstract

In this paper we introduce a novel context-free grammar, RNAFeatures, capable of generating any RNA structure, including pseudoknot structures (pk-structures). We represent pk-structures as orientable fatgraphs, which naturally leads to a filtration by topological genus. Within this framework, RNA secondary structures correspond to pk-structures of genus zero. RNAFeatures acts on formal, arc-labeled RNA secondary structures, called λ-structures. λ-structures correspond one-to-one to pk-structures together with some additional information: the specific rearrangement of the backbone by which a pk-structure can be made cross-free. RNAFeatures extends the grammar for secondary structures by labeling both the symbols and the production rules. We discuss how to use RNAFeatures to obtain a stochastic context-free grammar for pk-structures from RNA sequence and structure data. The induced grammar facilitates fast Boltzmann sampling and statistical analysis. As a first application, we present an O(n log n) runtime algorithm that samples pk-structures based on ninety tRNA sequences and structures from the Nucleic Acid Database (NDB).
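The genus filtration mentioned in the abstract can be illustrated with a small sketch. It is not taken from the authors' C code; it assumes the standard construction in which the backbone is collapsed to a single vertex, so that a diagram with n arcs and r boundary components has genus g = (n + 1 - r) / 2. The function name `genus` and the input format (a list of paired backbone positions) are hypothetical choices made for illustration.

```python
def genus(arcs):
    """Topological genus of a one-backbone diagram given as (i, j) base pairs.

    Assumption (not from the paper's code): the backbone is collapsed to a
    single vertex, so the 2n arc endpoints become half-edges in cyclic order.
    """
    arcs = [tuple(sorted(a)) for a in arcs]
    n = len(arcs)
    if n == 0:
        return 0  # no arcs: a bare backbone has genus zero

    endpoints = sorted(p for a in arcs for p in a)
    if len(set(endpoints)) != 2 * n:
        raise ValueError("each backbone position may be paired at most once")

    # Relabel endpoints 0 .. 2n-1 in backbone order; unpaired positions
    # do not affect the genus and are simply dropped.
    rank = {p: k for k, p in enumerate(endpoints)}

    # alpha is the involution that swaps the two endpoints of each arc.
    alpha = [0] * (2 * n)
    for i, j in arcs:
        alpha[rank[i]] = rank[j]
        alpha[rank[j]] = rank[i]

    # Boundary components are the cycles of "cross the arc, then step to the
    # next half-edge around the vertex", i.e. h -> (alpha[h] + 1) mod 2n.
    seen = [False] * (2 * n)
    boundaries = 0
    for start in range(2 * n):
        if seen[start]:
            continue
        boundaries += 1
        h = start
        while not seen[h]:
            seen[h] = True
            h = (alpha[h] + 1) % (2 * n)

    # Euler characteristic with one vertex and n edges: 2 - 2g - r = 1 - n.
    return (n + 1 - boundaries) // 2


nested = [(1, 10), (2, 9), (3, 8)]   # a hairpin stem, no crossings
h_type = [(1, 5), (3, 8)]            # two crossing arcs
print(genus(nested), genus(h_type))
```

Under these assumptions, a cross-free (secondary) structure yields genus zero, while two crossing arcs, the simplest pseudoknot pattern, yield genus one.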

AVAILABILITY

The source code for the simulation results is available at http://staff.vbi.vt.edu/fenixh/TPstructure.zip. The code is written in C and compiled with Xcode.
