Zhao Jianbang, Ma Xiaoke
College of Information Engineering, Northwest Agriculture & Forestry University, Xianyang, China.
School of Computer Science and Technology, Xidian University, Xi'an, China.
Front Genet. 2019 Jan 23;9:685. doi: 10.3389/fgene.2018.00685. eCollection 2018.
Long non-coding RNAs (lncRNAs) are critical regulators of biological processes and are closely related to complex diseases. Even though next-generation sequencing technology has facilitated the discovery of a great number of lncRNAs, knowledge about their functions remains limited. Predicting the functions of lncRNAs is therefore promising, because it can shed light on the mechanisms of complex diseases. Current algorithms predict the functions of lncRNAs by using the features of protein-coding genes. Generally speaking, these algorithms fuse heterogeneous genomic data via a linear combination to construct lncRNA-gene associations, which cannot fully characterize the function-lncRNA relations. To overcome this issue, we present a nonnegative matrix factorization algorithm with multiple partial regularization (MPrNMF) that predicts the functions of lncRNAs without fusing the heterogeneous genomic data. In detail, we construct lncRNA-gene associations for each type of genomic data, resulting in multiple sets of associations. The proposed method integrates them separately via a regularization strategy, rather than fusing them into a single type of association. The results demonstrate that the proposed algorithm outperforms state-of-the-art network-analysis-based methods. The model and algorithm provide an effective way to explore the functions of lncRNAs.
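The abstract does not state the MPrNMF objective, so the following is only a minimal sketch of the general idea it describes: a nonnegative matrix factorization in which each data source contributes its own regularization term instead of being fused into one association matrix beforehand. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the graph-Laplacian form of the penalty, the choice to let each source regularize its own block of latent factors (one possible reading of "partial"), and the multiplicative update rules, which are the standard ones for graph-regularized NMF.

```python
import numpy as np


def multi_regularized_nmf(X, graphs, blocks, lambdas, k,
                          n_iter=200, eps=1e-9, seed=0):
    """Factor X ~ W @ H with one graph penalty per data source.

    Hypothetical illustration, not the published MPrNMF algorithm.

    X       : (n, m) nonnegative matrix (e.g. lncRNA-by-feature scores)
    graphs  : list of (n, n) symmetric nonnegative similarity matrices,
              one per data source
    blocks  : list of column-index lists; graphs[v] only regularizes
              the latent factors W[:, blocks[v]] (indices must not repeat)
    lambdas : list of nonnegative penalty weights, one per source
    k       : number of latent factors
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    # Row sums of each similarity matrix, i.e. the diagonal of D_v.
    degrees = [A.sum(axis=1, keepdims=True) for A in graphs]

    for _ in range(n_iter):
        # Multiplicative update for W: reconstruction terms first ...
        num = X @ H.T
        den = W @ (H @ H.T)
        # ... then each source adds its penalty to its own block only.
        for A, d, cols, lam in zip(graphs, degrees, blocks, lambdas):
            num[:, cols] += lam * (A @ W[:, cols])
            den[:, cols] += lam * (d * W[:, cols])
        W *= num / (den + eps)
        # Standard multiplicative update for H (no penalty on H here).
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return W, H


def objective(X, W, H, graphs, blocks, lambdas):
    """Reconstruction error plus the per-source Laplacian penalties."""
    loss = np.linalg.norm(X - W @ H) ** 2
    for A, cols, lam in zip(graphs, blocks, lambdas):
        L = np.diag(A.sum(axis=1)) - A  # graph Laplacian L_v = D_v - A_v
        Wb = W[:, cols]
        loss += lam * np.trace(Wb.T @ L @ Wb)
    return loss
```

In this sketch the sources stay separate because each one acts on a different block of `W`. If every source instead penalized all columns of `W`, the sum of penalties would collapse into a single Laplacian built from a weighted sum of the graphs, which is the linear fusion the abstract argues against.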