The lncLocator: a subcellular localization predictor for long non-coding RNAs based on a stacked ensemble classifier.

Author information

Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China.

Department of Medical Informatics, Erasmus MC, Rotterdam, The Netherlands.

Publication information

Bioinformatics. 2018 Jul 1;34(13):2185-2194. doi: 10.1093/bioinformatics/bty085.

Abstract

MOTIVATION

Long non-coding RNAs (lncRNAs) have been a hot topic in RNA biology. Recent studies have shown that their subcellular localization carries important information for understanding their complex biological functions. Because experiments to identify lncRNA subcellular localization are costly and time-consuming, computational methods are urgently needed. However, to the best of our knowledge, no computational tool for predicting lncRNA subcellular localization exists to date.

RESULTS

In this study, we report lncLocator, an ensemble classifier-based predictor of lncRNA subcellular localization. To fully exploit lncRNA sequence information, we adopt both k-mer features and high-level abstraction features generated by unsupervised deep models, and construct four classifiers by feeding each of these two feature types to a support vector machine (SVM) and a random forest (RF). We then use a stacked ensemble strategy to combine the four classifiers and obtain the final prediction. The current lncLocator can predict five subcellular localizations of lncRNAs (cytoplasm, nucleus, cytosol, ribosome and exosome) and yields an overall accuracy of 0.59 on the constructed benchmark dataset.
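
To make the k-mer features mentioned above concrete, the following is a minimal sketch of how a sequence might be turned into a fixed-length k-mer frequency vector. The abstract does not state the value of k, the normalization, or how ambiguous bases are handled, so the function name `kmer_frequencies`, the default `k=3`, the U-to-T mapping, and the choice to skip windows containing non-ACGT characters are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: RNA input is mapped to the DNA alphabet (U -> T)


def kmer_frequencies(sequence, k=3):
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Hypothetical helper for illustration; windows containing characters
    outside ACGT (for example N) are ignored.
    """
    seq = sequence.upper().replace("U", "T")
    # All 4**k possible k-mers, so every sequence yields a vector of equal length.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        # Sequence shorter than k, or no valid window: return an all-zero vector.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Toy input sequence, not taken from the benchmark dataset.
features = kmer_frequencies("AUGGCUAUGGC", k=2)
print(len(features))
```

A vector like this could then be passed to each base classifier (SVM and RF in the abstract); the stacking step that combines the base classifiers is not shown here.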

AVAILABILITY AND IMPLEMENTATION

The lncLocator is available at www.csbio.sjtu.edu.cn/bioinf/lncLocator.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

