School of Software, Central South University, Changsha, 410075, China.
Lab of Information Management, Changzhou University, Jiangsu, 213164, China.
BMC Bioinformatics. 2018 Oct 11;19(1):370. doi: 10.1186/s12859-018-2390-0.
Identifying the interactions between proteins and long non-coding RNAs (lncRNAs) is essential for deciphering the functional mechanisms of lncRNAs. However, current experimental techniques for detecting lncRNA-protein interactions are limited and inefficient. Many computational methods have been proposed to predict protein-lncRNA interactions, but few make use of the topological information of the heterogeneous biological networks associated with lncRNAs.
In this work, we propose a novel approach, PLIPCOM, which uses two groups of network features to detect protein-lncRNA interactions. Specifically, diffusion features and HeteSim features are extracted from a protein-lncRNA heterogeneous network and then combined to build a prediction model with the Gradient Tree Boosting (GTB) algorithm. Our study highlights that the topological features of the heterogeneous network are crucial for predicting protein-lncRNA interactions. Cross-validation experiments on the benchmark dataset show that PLIPCOM substantially outperforms previous state-of-the-art approaches in predicting protein-lncRNA interactions. We also demonstrate the robustness of the proposed method on three unbalanced data sets. Moreover, our case studies indicate that the method is effective and reliable in predicting interactions between lncRNAs and proteins.
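To make the idea of a diffusion feature concrete, the following is a minimal sketch that assumes the diffusion profile of a node is obtained by random walk with restart (RWR) on the network. The abstract does not specify how PLIPCOM computes its diffusion features, so the toy network, the restart probability, and the function names below are illustrative assumptions rather than the published implementation.

```python
# Illustrative sketch only: random walk with restart (RWR) on a toy
# protein-lncRNA network. All names, values, and the network are hypothetical.

def column_normalize(adj):
    """Scale each column of a square adjacency matrix to sum to 1."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return the stationary visiting probabilities of a walker that starts
    at `seed` and jumps back to it with probability `restart` at every step."""
    n = len(adj)
    w = column_normalize(adj)
    e = [0.0] * n
    e[seed] = 1.0
    p = e[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * e[i]
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


def pair_features(adj, protein, lncrna):
    """Concatenate the diffusion profiles of a protein and an lncRNA node."""
    return random_walk_with_restart(adj, protein) + random_walk_with_restart(adj, lncrna)


# Toy heterogeneous network: nodes 0-1 are proteins, nodes 2-3 are lncRNAs.
NODES = ["P1", "P2", "L1", "L2"]
ADJ = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]

features = pair_features(ADJ, protein=0, lncrna=3)
print(dict(zip(NODES, random_walk_with_restart(ADJ, 0))))
```

In a full pipeline, a feature vector of this kind would be concatenated with HeteSim scores for the same protein-lncRNA pair and passed to a gradient tree boosting classifier, for example scikit-learn's `GradientBoostingClassifier`.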
The source code and supporting files are publicly available at: http://denglab.org/PLIPCOM/ .