Xiao Xiaofang, Zhu Wen, Liao Bo, Xu Junlin, Gu Changlong, Ji Binbin, Yao Yuhua, Peng Lihong, Yang Jialiang
College of Information Science and Engineering, Hunan University, Changsha, China.
School of Mathematics and Statistics, Hainan Normal University, Haikou, China.
Front Genet. 2018 Oct 16;9:411. doi: 10.3389/fgene.2018.00411. eCollection 2018.
In recent years, it has become increasingly clear that long noncoding RNAs (lncRNAs) play critical roles in many biological processes associated with human diseases. Inferring potential lncRNA-disease associations is essential for uncovering disease mechanisms, developing novel drugs, and optimizing personalized treatments. However, biological experiments to validate lncRNA-disease associations are time-consuming and costly, so effective computational models are needed. In this study, we propose a method called BPLLDA that predicts lncRNA-disease associations based on paths of fixed lengths in a heterogeneous lncRNA-disease association network. Specifically, BPLLDA first constructs a heterogeneous lncRNA-disease network by integrating the lncRNA-disease association network, the lncRNA functional similarity network, and the disease semantic similarity network. It then infers the probability of an lncRNA-disease association from the paths that connect the lncRNA and the disease in the network and from the lengths of those paths. Compared to existing methods, BPLLDA has a few advantages, including not requiring negative samples and being able to predict associations related to novel lncRNAs or novel diseases. BPLLDA was evaluated on LncRNADisease, a canonical lncRNA-disease association database, alongside two popular methods, LRLSLDA and GrwLDA. Under leave-one-out cross-validation, the areas under the receiver operating characteristic curve of BPLLDA are 0.87117, 0.82403, and 0.78528 for predicting overall associations, associations related to novel lncRNAs, and associations related to novel diseases, respectively, higher than those of the two compared methods. In addition, cervical cancer, glioma, and non-small-cell lung cancer were selected as case studies, for which the predicted top five lncRNA-disease associations were verified by recently published literature.
In summary, BPLLDA performs well in predicting novel lncRNA-disease associations and associations related to novel lncRNAs and diseases. It may contribute to the understanding of lncRNA-associated diseases such as certain cancers.
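The path-based scoring idea described above can be illustrated with a minimal sketch. It is not the published BPLLDA formula: the function name `path_scores`, the parameters `max_len` and `decay`, the use of walks (nodes may repeat) in place of paths, and the geometric down-weighting of longer walks are all assumptions made for illustration, and the toy matrices are invented.

```python
import numpy as np

def path_scores(assoc, lnc_sim, dis_sim, max_len=3, decay=0.5):
    """Score lncRNA-disease pairs by weighted walks of length <= max_len.

    Assumption: walks stand in for paths, and a walk of length k is
    down-weighted by decay**(k - 1). Neither choice comes from the paper.
    """
    assoc = np.asarray(assoc, dtype=float)
    lnc_sim = np.asarray(lnc_sim, dtype=float)
    dis_sim = np.asarray(dis_sim, dtype=float)
    n_lnc, n_dis = assoc.shape
    # Block adjacency matrix of the heterogeneous network:
    # [[lncRNA similarity, associations], [associations^T, disease similarity]]
    w = np.block([[lnc_sim, assoc], [assoc.T, dis_sim]])
    np.fill_diagonal(w, 0.0)  # drop self-loops so self-similarity adds nothing
    scores = np.zeros((n_lnc, n_dis))
    power = np.eye(n_lnc + n_dis)
    for k in range(1, max_len + 1):
        power = power @ w  # entry (i, j) sums edge-weight products over walks of length k
        scores += decay ** (k - 1) * power[:n_lnc, n_lnc:]
    return scores

# Hypothetical toy data: 3 lncRNAs, 2 diseases.
# lncRNA 2 has no known association (a "novel" lncRNA) but is similar to lncRNA 0.
assoc = [[1, 0], [0, 1], [0, 0]]
lnc_sim = [[1.0, 0.2, 0.8], [0.2, 1.0, 0.1], [0.8, 0.1, 1.0]]
dis_sim = [[1.0, 0.3], [0.3, 1.0]]

scores = path_scores(assoc, lnc_sim, dis_sim)
print(scores[2])  # scores of the novel lncRNA against both diseases
```

In this toy network, lncRNA 2 receives a higher score for disease 0 than for disease 1, because its strongest connection runs through the similar lncRNA 0, which is known to be associated with disease 0. No negative samples are used at any point, which matches the property claimed for BPLLDA, although the scoring function itself is only a stand-in.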