Multivariate Information Fusion With Fast Kernel Learning to Kernel Ridge Regression in Predicting LncRNA-Protein Interactions.

Author information

Shen Cong, Ding Yijie, Tang Jijun, Guo Fei

Affiliations

School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China.

School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China.

Publication information

Front Genet. 2019 Jan 15;9:716. doi: 10.3389/fgene.2018.00716. eCollection 2018.

Abstract

Long non-coding RNAs (lncRNAs) constitute a large class of transcribed RNA molecules. They are characteristically longer than 200 nucleotides and do not encode proteins. They play an important role in regulating gene expression by interacting with the homologous RNA-binding proteins. Because wet-lab experimental methods are laborious and time-consuming, computational approaches for predicting lncRNA-protein interactions (LPI) deserve more attention from researchers. A review of state-of-the-art studies shows that there is still room to improve both accuracy and speed. This paper proposes a novel method for identifying LPI that employs Kernel Ridge Regression based on Fast Kernel Learning (LPI-FKLKRR). The approach uses four distinct similarity measures in the lncRNA space and in the protein space, respectively. Notably, we extract Gene Ontology (GO) information for proteins to improve the quality of information in the protein space. To integrate the heterogeneous kernels, Fast Kernel Learning (FastKL) is applied to optimize the kernel weights. Kernel Ridge Regression (KRR) is then used to obtain the final predicted associations, which yields the prediction model. Experimental results show that LPI-FKLKRR performs markedly better than existing LPI prediction schemes. On the benchmark dataset, the proposed model LPI-FKLKRR achieves the best Area Under the Precision-Recall Curve (AUPR) of 0.6950, outperforming the integrated LPLNP (AUPR: 0.4584), RWR (AUPR: 0.2827), CF (AUPR: 0.2357), LPIHN (AUPR: 0.2299), and LPBNI (AUPR: 0.3302). Together with the experimental results of a case study on a novel dataset, this suggests that LPI-FKLKRR will be a useful tool for LPI prediction.
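
The pipeline described in the abstract (several kernels per space, a weighted combination, then kernel ridge regression) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The uniform weights are only a placeholder for the weights FastKL would learn, the RBF kernels on random features stand in for the paper's similarity measures, and averaging the lncRNA-side and protein-side scores is an assumption made here for illustration. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix from row-wise feature vectors (stand-in similarity measure)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combine_kernels(kernels, weights):
    """Weighted sum of same-shaped kernels; weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))

def krr_scores(K, Y, lam=1.0):
    """Kernel ridge regression fitted values: K (K + lam * I)^{-1} Y."""
    n = K.shape[0]
    return K @ np.linalg.solve(K + lam * np.eye(n), Y)

# Synthetic toy data (not from the paper): 6 lncRNAs, 4 proteins.
rng = np.random.default_rng(0)
n_lnc, n_prot = 6, 4
Y = (rng.random((n_lnc, n_prot)) < 0.3).astype(float)  # known interactions

# Four kernels per space, mirroring the "four similarity measures" in the abstract.
gammas = (0.1, 0.5, 1.0, 2.0)
lnc_feats = rng.normal(size=(n_lnc, 5))
prot_feats = rng.normal(size=(n_prot, 5))
lnc_kernels = [rbf_kernel(lnc_feats, g) for g in gammas]
prot_kernels = [rbf_kernel(prot_feats, g) for g in gammas]

# Placeholder: uniform weights instead of FastKL-optimized weights.
K_l = combine_kernels(lnc_kernels, [1, 1, 1, 1])
K_p = combine_kernels(prot_kernels, [1, 1, 1, 1])

# KRR in each space, then a simple average (an assumption of this sketch).
F_l = krr_scores(K_l, Y)        # shape (n_lnc, n_prot)
F_p = krr_scores(K_p, Y.T).T    # shape (n_lnc, n_prot)
F = 0.5 * (F_l + F_p)
```

Here `F[i, j]` is a score for the pair (lncRNA `i`, protein `j`); ranking the unknown pairs by this score gives candidate interactions. The regularization parameter `lam` is fixed at 1.0 only for simplicity and would normally be tuned.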


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c938/6340980/b26c5353ad13/fgene-09-00716-g0001.jpg
