College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China.
School of Computer and Information Science, Hunan Institute of Technology, Hengyang, 412002, China.
Sci Rep. 2017 Sep 29;7(1):12442. doi: 10.1038/s41598-017-12763-z.
Growing evidence indicates that mutation and dysregulation of long non-coding RNAs (lncRNAs) are associated with numerous diseases, including cancers. However, experimental methods for identifying lncRNA-disease associations are expensive and time-consuming. Effective computational approaches to identify disease-related lncRNAs are therefore in high demand and would benefit the detection of lncRNA biomarkers for disease diagnosis, treatment, and prevention. In light of some limitations of existing computational methods, we developed a global network random walk model for predicting lncRNA-disease associations (GrwLDA) to reveal potential associations between lncRNAs and diseases. GrwLDA is a universal network-based method and does not require negative samples. It can be applied to a disease with no known associated lncRNA (an isolated disease) and to an lncRNA with no known associated disease (a novel lncRNA). Leave-one-out cross validation (LOOCV) was used to evaluate the prediction performance of GrwLDA. GrwLDA obtained reliable AUCs of 0.9449, 0.8562, and 0.8374 for overall, novel lncRNA, and isolated disease prediction, respectively, significantly outperforming previous methods. Case studies of colon, gastric, and kidney cancers were also carried out, and the top 5 disease-lncRNA associations were reported for each disease. Notably, 13 of these 15 associations were confirmed by literature mining.
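To make the idea of a random walk over a global lncRNA-disease network more concrete, the following is a minimal sketch of a generic random walk with restart on a toy network. It is not the GrwLDA formulation, which the abstract does not specify. The five-node network, the node ordering, the restart probability of 0.5, and the function names `column_normalize` and `random_walk_with_restart` are all assumptions made for illustration.

```python
# Generic random walk with restart (RWR) on a toy lncRNA-disease network.
# Illustrative only: the network, node names, and restart value are assumed,
# and this is not the exact GrwLDA model.

def column_normalize(matrix):
    """Scale each column to sum to 1; all-zero columns stay zero."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(adjacency, seeds, restart=0.5,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence."""
    n = len(adjacency)
    w = column_normalize(adjacency)
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)  # walker starts uniformly on the seed nodes
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical nodes: 0-2 are lncRNAs A, B, C; 3-4 are diseases X, Y.
names = ["lncRNA_A", "lncRNA_B", "lncRNA_C", "disease_X", "disease_Y"]
edges = [(0, 1), (1, 2),   # lncRNA-lncRNA similarity
         (3, 4),           # disease-disease similarity
         (0, 3), (1, 4)]   # known lncRNA-disease associations
adjacency = [[0.0] * 5 for _ in range(5)]
for a, b in edges:
    adjacency[a][b] = adjacency[b][a] = 1.0

# Seed the walk at disease_X and rank the lncRNAs by steady-state score.
scores = random_walk_with_restart(adjacency, seeds=[3])
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
for i in ranking:
    print(names[i], round(scores[i], 4))
```

In this toy setting, lncRNA_C has no known disease association yet still receives a nonzero score through its similarity to lncRNA_B. That is the general reason network propagation can score novel lncRNAs and isolated diseases without negative samples.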