Tan Haojiang, Sun Quanmeng, Li Guanghui, Xiao Qiu, Ding Pingjian, Luo Jiawei, Liang Cheng
School of Information Science and Engineering, Shandong Normal University, Jinan, China.
School of Information Engineering, East China Jiaotong University, Nanchang, China.
Front Genet. 2020 Feb 21;11:89. doi: 10.3389/fgene.2020.00089. eCollection 2020.
Long noncoding RNAs (lncRNAs) are a class of noncoding RNA molecules longer than 200 nucleotides. Recent studies have uncovered their functional roles in diverse cellular processes and in tumorigenesis, so identifying novel disease-related lncRNAs may deepen our understanding of disease etiology. However, because relatively few associations between lncRNAs and diseases have been verified, reliably and effectively predicting the lncRNAs associated with a given disease remains challenging. In this paper, we propose a novel multiview consensus graph learning method to infer potential disease-related lncRNAs. Specifically, we first construct a set of similarity matrices for lncRNAs and diseases from the known associations. We then iteratively learn a consensus graph from the multiple input matrices and simultaneously optimize the predicted association probabilities within a multi-label learning framework. To evaluate the method, we compare it with three state-of-the-art methods on three widely used datasets. The experimental results show that our method achieves the best prediction performance under different cross-validation schemes. A case study on uterine cervical neoplasms further confirms the utility of our method for identifying lncRNAs as potential prognostic biomarkers in practice.
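The pipeline outlined in the abstract (similarity views built from known associations, a single consensus graph, then association scoring) can be illustrated with a minimal sketch. It is not the authors' algorithm. The abstract does not name the similarity measures, so the Gaussian interaction profile kernel and cosine similarity are assumed here. An unweighted average of the views stands in for the iteratively learned consensus graph, and closed-form label propagation stands in for the multi-label learning step. All function names, the toy association matrix, and the `alpha` parameter are hypothetical.

```python
import numpy as np


def gip_similarity(profiles):
    """Gaussian interaction profile kernel between the rows of a binary matrix."""
    sq_norms = (profiles ** 2).sum(axis=1)
    # Bandwidth normalised by the mean squared profile norm (guarded against zero).
    gamma = 1.0 / max(sq_norms.mean(), 1e-12)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def cosine_similarity(profiles):
    """Cosine similarity between the rows of a matrix."""
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / np.maximum(norms, 1e-12)
    return unit @ unit.T


def consensus_graph(views):
    """Assumed stand-in for consensus learning: average the views,
    symmetrise, and apply symmetric degree normalisation."""
    s = np.mean(views, axis=0)
    s = (s + s.T) / 2.0
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(s.sum(axis=1), 1e-12))
    return s * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def propagate_labels(graph, labels, alpha=0.5):
    """Closed-form label propagation: F = (1 - alpha) * (I - alpha * S)^-1 * Y."""
    n = graph.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * graph, labels)


# Toy known-association matrix: 4 lncRNAs (rows) x 3 diseases (columns).
associations = np.array(
    [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 0],
     [0, 0, 1]],
    dtype=float,
)

views = [gip_similarity(associations), cosine_similarity(associations)]
graph = consensus_graph(views)
scores = propagate_labels(graph, associations)  # one score per lncRNA-disease pair
```

Entries of `scores` at positions where `associations` is zero can be ranked per disease to propose candidate lncRNAs. In the method described above, the per-view weights and the graph itself are learned jointly with the predictions rather than fixed in advance as they are in this sketch.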