Liu Bin, Liu Fule, Fang Longyun, Wang Xiaolong, Chou Kuo-Chen
School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, 518055, Guangdong, China.
Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, 518055, Guangdong, China.
Mol Genet Genomics. 2016 Feb;291(1):473-81. doi: 10.1007/s00438-015-1078-7. Epub 2015 Jun 18.
With the rapid growth of RNA sequence data in the postgenomic age, a flexible method is needed that can generate various kinds of vectors to represent these sequences, each focusing on different features. The need arises because nearly all existing machine-learning methods, such as the support vector machine (SVM) and k-nearest neighbor (KNN), can handle only vectors, not sequences. To meet this growing demand and speed up genome analyses, we have developed a new web server called "representations of RNA sequences" (repRNA). Compared with existing methods, repRNA is much more comprehensive, flexible and powerful, as reflected by the following facts: (1) it can generate 11 different modes of feature vectors, from which users can choose according to their investigation purposes; (2) it allows users to select features from 22 built-in physicochemical properties, as well as properties they define themselves; (3) the resultant feature vectors and the secondary structures of the corresponding RNA sequences can be visualized. The repRNA web server is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repRNA/.
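To make concrete why a sequence must be turned into a vector before an SVM or KNN classifier can use it, the following minimal sketch computes a k-mer composition vector for an RNA sequence. This is only an illustration of the general idea: the abstract does not list the 11 modes repRNA offers, so k-mer composition is an assumed, commonly used representation rather than a description of repRNA's implementation, and the function name `kmer_composition` is hypothetical.

```python
from itertools import product

RNA_ALPHABET = "ACGU"


def kmer_composition(sequence, k=2):
    """Return the normalized k-mer frequency vector of an RNA sequence.

    Hypothetical helper for illustration only; it is not part of repRNA.
    The vector has 4**k entries, ordered lexicographically over A, C, G, U.
    Windows containing any other character are skipped.
    """
    seq = sequence.upper()
    kmers = ["".join(p) for p in product(RNA_ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of length k along the sequence and count each k-mer.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid windows: return a zero vector.
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


if __name__ == "__main__":
    vector = kmer_composition("GGGAAACCU", k=2)
    print(len(vector))  # number of features: 4**2
    print(vector)
```

Every sequence, whatever its length, maps to a vector of the same dimension (4^k), which is the property that fixed-input learners such as SVM and KNN require.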