Zhang Jingpu, Zou Shuai, Deng Lei
School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, 467000, China.
School of Information Science and Engineering, Central South University, Changsha, 410083, China.
BMC Med Genomics. 2018 Nov 20;11(Suppl 5):99. doi: 10.1186/s12920-018-0414-2.
With the development of sequencing technology, more and more long non-coding RNAs (lncRNAs) have been identified. Some lncRNAs have been confirmed to play important roles in development through dosage compensation, epigenetic regulation, regulation of cell differentiation and other mechanisms. However, the majority of lncRNAs have not been functionally characterized. Exploring the functions of lncRNAs and their regulatory networks has become an active research topic.
In this work, a network-based model named BiRWLGO is developed to predict the probable functions of lncRNAs at large scale. The model first builds a global network composed of three networks: an lncRNA similarity network, an lncRNA-protein association network and a protein-protein interaction (PPI) network. It then uses a bi-random walk algorithm to explore the similarities between lncRNAs and proteins. Finally, each lncRNA is annotated with Gene Ontology (GO) terms according to its neighboring proteins.
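The pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration of a generic bi-random walk with restart on an lncRNA-protein score matrix, not the authors' implementation: the update rule, the symmetric normalization, the parameters `alpha`, `left_steps`, `right_steps` and `top_k`, and the function names are all assumptions made for the example.

```python
import numpy as np


def sym_normalize(W):
    """Return D^-1/2 W D^-1/2; nodes with zero degree keep all-zero rows."""
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    return W * inv_sqrt[:, None] * inv_sqrt[None, :]


def bi_random_walk(lnc_sim, ppi, assoc, alpha=0.8, left_steps=3, right_steps=3):
    """Score lncRNA-protein pairs by walking on both networks.

    lnc_sim: (n, n) lncRNA similarity matrix
    ppi:     (m, m) protein-protein interaction matrix
    assoc:   (n, m) known lncRNA-protein associations
    alpha:   weight of the walk versus the restart term (assumed value)
    """
    L = sym_normalize(lnc_sim)
    P = sym_normalize(ppi)
    A = np.asarray(assoc, dtype=float)
    if A.sum() > 0:
        A = A / A.sum()  # restart distribution over known associations
    R = A.copy()
    for step in range(1, max(left_steps, right_steps) + 1):
        walks = []
        if step <= left_steps:   # walk on the lncRNA similarity network
            walks.append(alpha * (L @ R) + (1 - alpha) * A)
        if step <= right_steps:  # walk on the PPI network
            walks.append(alpha * (R @ P) + (1 - alpha) * A)
        R = sum(walks) / len(walks)
    return R


def annotate(scores, protein_go, top_k=2):
    """Transfer GO terms from the top_k highest-scoring proteins.

    scores:     one row of the score matrix (one lncRNA)
    protein_go: list of GO-term sets, one per protein (hypothetical input)
    Returns (term, score) pairs sorted by decreasing score.
    """
    ranked = np.argsort(-np.asarray(scores))[:top_k]
    term_scores = {}
    for j in ranked:
        for term in protein_go[j]:
            term_scores[term] = term_scores.get(term, 0.0) + float(scores[j])
    return sorted(term_scores.items(), key=lambda kv: kv[1], reverse=True)
```

In this sketch, an lncRNA with no known protein partners can still receive scores through similar lncRNAs (the left walk), and those scores then spread to interacting proteins (the right walk), which is the intuition behind combining the three networks.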
We compare the performance of BiRWLGO with state-of-the-art models on a manually annotated lncRNA benchmark with known GO terms. The experimental results show that BiRWLGO outperforms the other methods in terms of both maximum F-measure (Fmax) and coverage.
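For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It assumes the protein-centric definition used in community function-prediction assessments (precision averaged over items with at least one prediction at a threshold, recall averaged over all benchmark items) and takes coverage to be the fraction of benchmark lncRNAs that receive at least one prediction. Whether the paper uses exactly these definitions is an assumption, and the function name and input layout are hypothetical.

```python
def fmax_and_coverage(predictions, truth, thresholds=None):
    """Compute maximum F-measure and coverage.

    predictions: {lncRNA: {GO term: score in [0, 1]}}
    truth:       {lncRNA: set of true GO terms}
    """
    if not truth:
        raise ValueError("truth must contain at least one benchmark item")
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    n = len(truth)
    # Coverage: share of benchmark items with any prediction at all.
    covered = sum(1 for item in truth if predictions.get(item))
    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for item, true_terms in truth.items():
            predicted = {g for g, s in predictions.get(item, {}).items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:  # precision only counts items with predictions
                precisions.append(hits / len(predicted))
            if true_terms:
                recall_sum += hits / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / n  # recall averages over all benchmark items
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best, covered / n
```

Because recall is averaged over every benchmark item, a method that leaves many lncRNAs without predictions is penalized in both metrics, which is why the two are usually reported together.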
BiRWLGO is a relatively effective method for predicting the functions of lncRNAs. Integrating protein interaction data substantially improves its predictive performance.