College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China.
Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, 410081, China.
BMC Bioinformatics. 2019 Feb 7;20(1):67. doi: 10.1186/s12859-019-2654-3.
Non-coding RNAs (ncRNAs) are emerging as key regulators and play critical roles in the tumorigenesis of a wide range of cancers. Recent studies have suggested that long non-coding RNAs (lncRNAs) can interact with microRNAs (miRNAs) and indirectly regulate miRNA targets through competing interactions. Therefore, uncovering the competing endogenous RNA (ceRNA) regulatory mechanism of lncRNAs, miRNAs and mRNAs at the post-transcriptional level will aid in deciphering the underlying pathogenesis of human polygenic diseases and may unveil new diagnostic and therapeutic opportunities. However, the functional roles of the vast majority of cancer-specific ncRNAs and their combinatorial regulation patterns are still insufficiently understood.
Here we develop an integrative framework called CeModule to discover lncRNA-, miRNA- and mRNA-associated regulatory modules. We use the matched expression profiles of lncRNAs, miRNAs and mRNAs and establish a model based on joint orthogonal non-negative matrix factorization to identify modules. We also incorporate experimentally verified miRNA-lncRNA interactions, validated miRNA-mRNA interactions and a weighted gene-gene network into this framework as network-based penalties to improve module accuracy. Sparse regularization terms are added so that the model yields sparse, modular solutions. Finally, an iterative multiplicative updating algorithm is adopted to solve the optimization problem.
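To make the factorization step concrete, the following is a minimal sketch of a joint non-negative matrix factorization solved with multiplicative updates. It is not the CeModule implementation: it assumes three non-negative expression matrices with matched samples in the rows, a shared basis matrix `W` and one coefficient matrix per RNA type, and it deliberately omits the orthogonality constraint, the network-based penalties and the sparse regularization described above. The function name `joint_nmf`, its parameters and the synthetic data are all hypothetical.

```python
import numpy as np

def joint_nmf(mats, k, n_iter=200, eps=1e-9, seed=0):
    """Plain joint NMF sketch: X_i ~= W @ H_i with a shared W.

    Assumption: each X_i is non-negative with shape (samples, features_i).
    Orthogonality, network penalties and sparsity terms are NOT included.
    """
    rng = np.random.default_rng(seed)
    n_samples = mats[0].shape[0]
    W = rng.random((n_samples, k))                       # shared basis
    Hs = [rng.random((k, X.shape[1])) for X in mats]     # one per RNA type
    losses = []
    for _ in range(n_iter):
        # Multiplicative update for the shared basis W
        num = sum(X @ H.T for X, H in zip(mats, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W *= num / den
        # Multiplicative update for each coefficient matrix H_i
        WtW = W.T @ W
        Hs = [H * (W.T @ X) / (WtW @ H + eps) for X, H in zip(mats, Hs)]
        # Total squared reconstruction error over all data types
        losses.append(sum(np.linalg.norm(X - W @ H) ** 2
                          for X, H in zip(mats, Hs)))
    return W, Hs, losses

# Synthetic stand-ins for matched lncRNA, miRNA and mRNA profiles
# (20 samples; feature counts are arbitrary illustration values).
rng = np.random.default_rng(1)
X_lnc = rng.random((20, 30))
X_mir = rng.random((20, 15))
X_mrna = rng.random((20, 50))

W, (H_lnc, H_mir, H_mrna), losses = joint_nmf([X_lnc, X_mir, X_mrna], k=4)
```

In a sketch like this, each of the `k` rows of `H_lnc`, `H_mir` and `H_mrna` would correspond to one candidate module, and the features with large entries in a row would be its members. How members are thresholded, and how the additional penalty terms change the update rules, is specified in the paper itself rather than here.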
We applied CeModule to two cancer datasets obtained from TCGA: ovarian cancer (OV) and uterine corpus endometrial carcinoma (UCEC). The modular analysis indicated that the lncRNAs, miRNAs and mRNAs within the identified modules are significantly associated with one another and functionally enriched in cancer-related biological processes and pathways, which may provide new insights into the complex regulatory mechanisms of human diseases at the system level.