School of Science and Technology, University of Camerino, Via Madonna della Carceri 9, Camerino, 62032, Italy.
BMC Bioinformatics. 2019 Apr 18;20(Suppl 4):161. doi: 10.1186/s12859-019-2689-5.
RNA secondary structure comparison is a fundamental task in several areas of study, including RNA structure prediction and evolution. At present, the comparison can be done efficiently only for pseudoknot-free structures, because these have an inherent tree representation.
In this work, we introduce an algebraic language to represent RNA secondary structures with arbitrary pseudoknots. Each structure is associated with a unique algebraic RNA tree, derived from a tree grammar whose operators are concatenation, nesting and crossing. From an algebraic RNA tree, we define an abstraction in which the primary structure is neglected. The resulting structural RNA tree allows us to define a new measure of similarity, computed using classical tree alignment.
The tree grammar and its operators make it possible to represent any RNA secondary structure uniquely as a tree. Structural RNA trees allow us to compare RNA secondary structures with arbitrary pseudoknots without taking the primary structure into account.
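To make the three operator names concrete, the following minimal sketch classifies how two base pairs relate to each other by position. It is not the tree grammar of the paper, only the standard pairwise notion behind the terms: two pairs that cross are what makes a structure pseudoknotted. The names `relation` and `has_pseudoknot` and the representation of a base pair as a tuple `(i, j)` with `i < j` are assumptions made for this illustration.

```python
from typing import List, Tuple

# Assumed representation: a base pair is (i, j) with i < j,
# the sequence positions of the two paired nucleotides.
Pair = Tuple[int, int]


def relation(p: Pair, q: Pair) -> str:
    """Classify two base pairs as concatenation, nesting, or crossing."""
    (i, j), (k, l) = sorted([p, q])  # order the pairs by opening position
    if not (i < j and k < l) or len({i, j, k, l}) < 4:
        raise ValueError("expected two well-formed pairs sharing no position")
    if j < k:
        return "concatenation"  # i < j < k < l: one pair after the other
    if l < j:
        return "nesting"        # i < k < l < j: second pair inside the first
    return "crossing"           # i < k < j < l: the pairs interleave


def has_pseudoknot(pairs: List[Pair]) -> bool:
    """Return True if any two pairs in the structure cross."""
    return any(
        relation(p, q) == "crossing"
        for a, p in enumerate(pairs)
        for q in pairs[a + 1:]
    )
```

For example, under these assumptions `relation((1, 4), (2, 6))` reports a crossing, so a structure containing both pairs has no direct tree representation in the pseudoknot-free sense.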