School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, China.
IEEE/ACM Trans Comput Biol Bioinform. 2011 Nov-Dec;8(6):1535-44. doi: 10.1109/TCBB.2011.50.
Large amounts of raw biological sequence data have been generated by the Human Genome Project and related efforts. Understanding the structural information encoded in biological sequences is important for learning their biochemical functions, but it remains a fundamental challenge. Recent interest in RNA regulation has led to rapid growth in the number of RNA secondary structures deposited in various databases. However, the functional classification and characterization of RNA structures have been only partially addressed. This article introduces a novel interval-based distance metric for structure-based RNA function assignment. The characterization of RNA structures relies on distance vectors learned from a collection of predicted structures. The distance measure considers the intersection, disjointness, and inclusion relations between intervals. A set of pseudoknotted RNA structures with known function is used, and the function of a query structure is determined by measuring structural similarity. This not only offers sequence distance criteria for measuring the similarity of secondary structures but also aids the functional classification of RNA structures with pseudoknots.
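The abstract does not say how the three interval relations are combined into a distance, so the following is only a minimal sketch of the relations themselves, not the metric proposed in the article. It assumes each base pair is treated as a closed interval (i, j) with i < j. The names relation, relation_profile, and toy_distance are hypothetical, and the L1 comparison of relation counts is a stand-in chosen for illustration.

```python
from itertools import combinations

def relation(a, b):
    """Classify two closed intervals as disjoint, inclusion, or intersected."""
    (i, j), (k, l) = sorted([a, b])   # order so that i <= k
    if j < k:
        return "disjoint"             # a ends before b starts
    if l <= j or i == k:
        return "inclusion"            # one interval lies inside the other (nested pairs)
    return "intersected"              # partial overlap (crossing pairs, as in a pseudoknot)

def relation_profile(pairs):
    """Count each relation over all unordered pairs of intervals.

    Returns (intersected, disjoint, inclusion). This summary is an
    assumption for illustration, not the article's distance vector.
    """
    counts = {"intersected": 0, "disjoint": 0, "inclusion": 0}
    for a, b in combinations(pairs, 2):
        counts[relation(a, b)] += 1
    return (counts["intersected"], counts["disjoint"], counts["inclusion"])

def toy_distance(pairs_a, pairs_b):
    """L1 distance between relation profiles (hypothetical stand-in metric)."""
    pa, pb = relation_profile(pairs_a), relation_profile(pairs_b)
    return sum(abs(x - y) for x, y in zip(pa, pb))

# Two made-up structures: one with crossing stems, one with two separate hairpins.
pseudoknot = [(1, 10), (2, 9), (6, 15), (7, 14)]
nested = [(1, 10), (2, 9), (12, 20), (13, 19)]

print(relation_profile(pseudoknot))
print(relation_profile(nested))
print(toy_distance(pseudoknot, nested))
```

In this toy setting, crossing base pairs appear as "intersected" intervals, which is what distinguishes a pseudoknotted structure from a purely nested one.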