iSS-PseDNC: identifying splicing sites using pseudo dinucleotide composition.

Author information

Chen Wei, Feng Peng-Mian, Lin Hao, Chou Kuo-Chen

Affiliations

Department of Physics, School of Sciences, Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000, China; Gordon Life Science Institute, Boston, MA 02478, USA.

School of Public Health, Hebei United University, Tangshan 063000, China.

Publication information

Biomed Res Int. 2014;2014:623149. doi: 10.1155/2014/623149. Epub 2014 May 21.

Abstract

In eukaryotic genes, exons are generally interrupted by introns. Accurately removing introns and joining exons together are essential processes in eukaryotic gene expression. With the avalanche of genome sequences generated in the postgenomic age, it is highly desired to develop automated methods for rapid and effective detection of splice sites that play important roles in gene structure annotation and even in RNA splicing. Although a series of computational methods were proposed for splice site identification, most of them neglected the intrinsic local structural properties. In the present study, a predictor called "iSS-PseDNC" was developed for identifying splice sites. In the new predictor, the sequences were formulated by a novel feature-vector called "pseudo dinucleotide composition" (PseDNC) into which six DNA local structural properties were incorporated. It was observed by the rigorous cross-validation tests on two benchmark datasets that the overall success rates achieved by iSS-PseDNC in identifying splice donor site and splice acceptor site were 85.45% and 87.73%, respectively. It is anticipated that iSS-PseDNC may become a useful tool for identifying splice sites and that the six DNA local structural properties described in this paper may provide novel insights for in-depth investigations into the mechanism of RNA splicing.
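The abstract names the PseDNC feature vector but does not give its formula. The sketch below shows one general way a pseudo dinucleotide composition can be computed, assuming the usual pseudo-composition form: 16 normalized dinucleotide frequencies followed by `lam` sequence-order correlation terms weighted by `w`. The function names, the parameters `lam` and `w`, and above all the property table are assumptions made for illustration. The table values are placeholders, not the six measured structural properties used in the paper.

```python
from itertools import product

# All 16 dinucleotides in a fixed order: AA, AC, AG, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def placeholder_properties(n_props=6):
    """Return a HYPOTHETICAL table: dinucleotide -> list of n_props values.

    These numbers are made up so the sketch can run. A real application
    would substitute standardized physicochemical values for each of the
    six local structural properties.
    """
    table = {}
    for idx, dn in enumerate(DINUCLEOTIDES):
        table[dn] = [((idx * (k + 3)) % 7 - 3) / 3.0 for k in range(n_props)]
    return table


def pse_dnc(seq, props, lam=3, w=0.5):
    """Sketch of a pseudo dinucleotide composition vector (16 + lam values).

    lam (number of correlation tiers) and w (weight) are assumed
    parameters; the abstract does not state the values used.
    """
    seq = seq.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G, T")

    # Overlapping dinucleotides along the sequence.
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if lam >= len(dinucs):
        raise ValueError("lam must be smaller than the number of dinucleotides")

    # Normalized occurrence frequency of each of the 16 dinucleotides.
    freqs = [dinucs.count(dn) / len(dinucs) for dn in DINUCLEOTIDES]

    # Tier-j correlation: mean squared property difference between
    # dinucleotides that are j positions apart, averaged over properties.
    thetas = []
    for j in range(1, lam + 1):
        terms = []
        for a, b in zip(dinucs, dinucs[j:]):
            pa, pb = props[a], props[b]
            terms.append(sum((x - y) ** 2 for x, y in zip(pa, pb)) / len(pa))
        thetas.append(sum(terms) / len(terms))

    # Joint normalization so the whole vector sums to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    example = "ACGTTGCAAGGTAAGTCC"  # arbitrary illustrative sequence
    vector = pse_dnc(example, placeholder_properties(), lam=3, w=0.5)
    print(len(vector))
```

A vector of this kind would then be passed to a classifier trained on donor-site or acceptor-site windows; the abstract does not say which classifier iSS-PseDNC uses, so that step is left out here.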
