Predicting Long non-coding RNAs through feature ensemble learning.

Affiliation

College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China.

Publication information

BMC Genomics. 2020 Dec 17;21(Suppl 13):865. doi: 10.1186/s12864-020-07237-y.

Abstract

BACKGROUND

Advances in sequencing technologies have generated large numbers of transcripts, and long non-coding RNAs (lncRNAs) are an important class among them. Identifying lncRNAs among transcripts is an important but challenging task. Traditional experimental methods for identifying lncRNAs are time-consuming and labor-intensive, so efficient computational methods for lncRNA prediction are in demand.

RESULTS

In this paper, we propose two lncRNA prediction methods based on feature ensemble learning strategies, LncPred-IEL and LncPred-ANEL. Specifically, we encode sequences into six different types of features, comprising transcript-specific features and general sequence-derived features. We then consider two feature ensemble strategies for using and integrating the information in the different feature types: iterative ensemble learning (IEL) and attention network ensemble learning (ANEL). IEL combines base predictors built on the six feature types in a supervised, iterative manner. ANEL uses an attention-based deep learning model to combine features by adaptively learning the weight of each feature type. Experiments demonstrate that both LncPred-IEL and LncPred-ANEL can effectively separate lncRNAs from other transcripts in feature space. Moreover, comparison experiments show that LncPred-IEL and LncPred-ANEL outperform several state-of-the-art methods under 5-fold cross-validation. Both methods also perform well in cross-species lncRNA prediction.
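To make the idea of "adaptively learning the weight of individual feature types" more concrete, the following is a minimal sketch of attention-style weighting over feature types. It is not the LncPred-ANEL architecture, which the abstract does not specify. It assumes that each feature type has already been projected to a vector of the same length and that a query vector (which a real model would learn during training) is available. The feature-type names `kmer`, `orf`, and `gc_content` and the function `attention_fuse` are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: turns raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(
    embeddings: Dict[str, List[float]], query: List[float]
) -> Tuple[Dict[str, float], List[float]]:
    """Weight each feature type by softmax(query . embedding) and sum them.

    Assumption: every embedding has the same length as `query`.
    """
    names = list(embeddings)
    # One relevance score per feature type (dot product with the query).
    scores = [sum(q * x for q, x in zip(query, embeddings[n])) for n in names]
    weights = softmax(scores)
    # Fused representation: attention-weighted sum of the per-type embeddings.
    fused = [
        sum(w * embeddings[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), fused


# Hypothetical, hand-written embeddings for one transcript (illustration only).
embeddings = {"kmer": [1.0, 0.0], "orf": [0.0, 1.0], "gc_content": [0.0, 0.0]}
query = [1.0, 0.0]  # In a trained model this vector would be learned.
weights, fused = attention_fuse(embeddings, query)
```

In this toy input the `kmer` embedding aligns best with the query, so it receives the largest weight, while the other two feature types receive equal, smaller weights. In a trained network the query and the projections would be fitted jointly with the classifier, so the weights would reflect how informative each feature type is for the prediction.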

CONCLUSIONS

LncPred-IEL and LncPred-ANEL are promising lncRNA prediction tools that can effectively utilize and integrate the information in different types of features.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65f9/7745355/53435e53eb99/12864_2020_7237_Fig1_HTML.jpg
