School of Information Science and Engineering, Central South University, Changsha 410083, China.
School of Computer (Software), Ping Ding Shan University, Pingdingshan 467000, China.
Bioinformatics. 2018 May 15;34(10):1750-1757. doi: 10.1093/bioinformatics/btx833.
Long non-coding RNAs (lncRNAs) are an enormous collection of functional non-coding RNAs. Over the past decades, a large number of novel lncRNA genes have been identified. However, most lncRNAs remain functionally uncharacterized. Computational approaches offer new insight into the potential functional implications of lncRNAs.
Because each lncRNA may have multiple functions, and a function may be further specialized into sub-functions, we describe NeuraNetL2GO, a computational approach that predicts the ontological functions of lncRNAs using a hierarchical multi-label classification strategy based on multiple neural networks. The neural networks are trained incrementally, level by level, and each predicts the gene ontology (GO) terms belonging to a given level. NeuraNetL2GO takes topological features of the lncRNA similarity network as the input to the neural networks and uses their outputs to annotate the lncRNAs. On the manually annotated lncRNA2GO-55 dataset, NeuraNetL2GO achieves the best performance compared with other state-of-the-art methods, with an overall advantage in maximum F-measure and coverage.
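The level-by-level strategy can be illustrated with a minimal sketch; it is not the NeuraNetL2GO implementation. It assumes one small multi-label network per GO level and, as a further assumption not stated in the abstract, feeds each level's predicted scores to the next level alongside the input features. The names `LevelNet`, `train_hierarchy` and `predict_hierarchy`, the network size, and the random toy data standing in for topological features are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LevelNet:
    """Hypothetical one-hidden-layer multi-label network for one GO level."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def predict(self, X):
        # Cache hidden activations so fit() can reuse them for backprop.
        self.H = sigmoid(X @ self.W1 + self.b1)
        return sigmoid(self.H @ self.W2 + self.b2)

    def fit(self, X, Y, epochs=300, lr=0.5):
        n = X.shape[0]
        for _ in range(epochs):
            P = self.predict(X)
            # Gradient of the mean binary cross-entropy w.r.t. output logits.
            d2 = (P - Y) / n
            d1 = (d2 @ self.W2.T) * self.H * (1.0 - self.H)
            self.W2 -= lr * self.H.T @ d2
            self.b2 -= lr * d2.sum(axis=0)
            self.W1 -= lr * X.T @ d1
            self.b1 -= lr * d1.sum(axis=0)
        return self

def train_hierarchy(X, Y_levels, n_hidden=8, seed=0):
    """Train one network per level, from the top level downwards."""
    rng = np.random.default_rng(seed)
    nets, inputs = [], X
    for Y in Y_levels:
        net = LevelNet(inputs.shape[1], n_hidden, Y.shape[1], rng).fit(inputs, Y)
        nets.append(net)
        # Assumption: the next level sees the features plus this level's scores.
        inputs = np.hstack([X, net.predict(inputs)])
    return nets

def predict_hierarchy(nets, X):
    """Return one score matrix (samples x terms) per level."""
    scores, inputs = [], X
    for net in nets:
        P = net.predict(inputs)
        scores.append(P)
        inputs = np.hstack([X, P])
    return scores

# Toy data: 40 "lncRNAs", 5 features, 2 terms at level 1 and 3 at level 2.
rng = np.random.default_rng(1)
X = rng.random((40, 5))
Y1 = (X[:, :2] > 0.5).astype(float)
# A child term is only annotated when its parent term is annotated.
Y2 = Y1[:, [0, 0, 1]] * (X[:, 2:5] > 0.5)

nets = train_hierarchy(X, [Y1, Y2])
scores = predict_hierarchy(nets, X)
```

Each entry of `scores` holds per-term probabilities for one level; thresholding them would give the predicted GO annotations for that level. The threshold and any consistency correction between parent and child terms are left out of this sketch.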
The source code and data are available at http://denglab.org/NeuraNetL2GO/.
Supplementary data are available at Bioinformatics online.