PredLnc-GFStack: A Global Sequence Feature Based on a Stacked Ensemble Learning Method for Predicting lncRNAs from Transcripts.

Affiliations

College of Informatics, Huazhong Agricultural University, Wuhan 430070, China.

School of Computer Science, Wuhan University, Wuhan 430072, China.

Publication information

Genes (Basel). 2019 Sep 3;10(9):672. doi: 10.3390/genes10090672.

Abstract

Long non-coding RNAs (lncRNAs) are a class of RNAs longer than 200 base pairs (bp) that do not encode proteins; nevertheless, they have many vital biological functions. The development of high-throughput sequencing technology has led to the discovery of a large number of novel transcripts, so computational methods for lncRNA prediction are in great demand. In this paper, we consider global sequence features and propose a stacked ensemble learning-based method, abbreviated as PredLnc-GFStack, to predict lncRNAs from transcripts. We extract the critical features from the candidate feature list using a genetic algorithm (GA) and then employ the stacked ensemble learning method to construct the PredLnc-GFStack model. Computational experiments show that PredLnc-GFStack outperforms several state-of-the-art methods for lncRNA prediction. Furthermore, PredLnc-GFStack demonstrates an outstanding ability for cross-species ncRNA prediction.
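
To make the feature-selection step in the abstract more concrete, the following is a minimal sketch of GA-based feature selection over binary feature masks. It is not the authors' implementation: the abstract does not specify the GA operators, the fitness function, or the candidate features, so the synthetic data, the nearest-centroid scorer, the per-feature penalty, and every function name here (`make_toy_data`, `centroid_accuracy`, `fitness`, `ga_select`) are assumptions chosen only for illustration. The stacked ensemble itself is not shown.

```python
import random

def make_toy_data(n_samples=200, n_features=12, n_informative=3, seed=0):
    """Synthetic stand-in for transcript features (an assumption, not real data):
    only the first n_informative columns carry class signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so the classes stay balanced
        row = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Validation accuracy of a nearest-centroid classifier using only masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {label: sum((x[j] - c) ** 2 for j, c in zip(cols, centroids[label]))
                for label in (0, 1)}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_val)

def fitness(mask, data, penalty=0.01):
    """Hypothetical fitness: validation accuracy minus a small cost per selected feature."""
    X_train, y_train, X_val, y_val = data
    return centroid_accuracy(X_train, y_train, X_val, y_val, mask) - penalty * sum(mask)

def ga_select(data, n_features, pop_size=20, generations=15, mutation_rate=0.05, seed=1):
    """Evolve binary masks with truncation selection, one-point crossover,
    bit-flip mutation, and elitism; return the best mask seen."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=lambda m: fitness(m, data))
    for _ in range(generations):
        scored = sorted(population, key=lambda m: fitness(m, data), reverse=True)
        parents = scored[: pop_size // 2]      # keep the fitter half as parents
        children = [scored[0][:]]              # elitism: carry over the top mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover position
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population + [best], key=lambda m: fitness(m, data))
    return best

# Usage: split the toy data into training and validation parts, then search.
X, y = make_toy_data()
data = (X[:140], y[:140], X[140:], y[140:])
best_mask = ga_select(data, n_features=12)
print("selected features:", [j for j, keep in enumerate(best_mask) if keep])
print("fitness:", round(fitness(best_mask, data), 3))
```

In a fuller pipeline, the selected columns would then be passed to the stacked ensemble; here a simple classifier stands in only to score candidate masks, and the penalty term is one possible way to favor compact feature sets.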

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a455/6770532/00b22d0b7e40/genes-10-00672-g001.jpg
