Luo Jiawei, Xiang Gen, Pan Chu
IEEE Trans Nanobioscience. 2017 Jan;16(1):51-59. doi: 10.1109/TNB.2017.2649560. Epub 2017 Jan 9.
MicroRNAs (miRNAs) and transcription factors (TFs) are regulators known to play an important role in gene regulation. However, few studies have examined the collaborative regulatory (co-regulatory) mechanism between miRNAs and TFs at the system (functional) level. Meanwhile, recent advances in high-throughput genomic technologies have enabled researchers to collect diverse large-scale genomic data, which can be used to study this co-regulatory mechanism. In this paper, we propose a novel method called Sparse Network regularized non-negative matrix factorization for co-regulatory modules identification, which adopts a multiple non-negative matrix factorization framework to identify co-regulatory modules comprising miRNAs, TFs and genes. The method jointly integrates miRNA, TF and gene expression profiles, and incorporates additional prior networks as regularization terms. In addition, to address the sparsity of these networks, we apply sparsity penalties to the variables to obtain modular solutions. The resulting mathematical formulation can be solved effectively by an iterative multiplicative updating algorithm. We apply the method to multiple genomic data sets, including the expression profiles of miRNAs, TFs and genes for breast cancer obtained from TCGA, prior miRNA-gene regulations, TF-gene regulations and gene-gene interactions. The results show that the miRNAs, TFs and genes within each co-regulatory module are significantly associated and that the modules have a reasonable size distribution. Furthermore, the co-regulatory modules are significantly enriched in Gene Ontology biological processes and Kyoto Encyclopedia of Genes and Genomes pathways.
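The general idea of a joint, network-regularized factorization solved by multiplicative updates can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation used in the paper. The three expression matrices are assumed to share a sample-by-module basis `W`. The objective is a simplified stand-in: squared reconstruction error, minus trace terms that reward co-membership of linked miRNA-gene, TF-gene and gene-gene pairs, plus an entrywise L1 penalty on the coefficient matrices. The function names `joint_nmf`, `relative_error` and `module_members`, the parameters `lam` and `gamma`, and the z-score threshold `t` are all hypothetical.

```python
import numpy as np


def joint_nmf(X_m, X_t, X_g, A, B, C, k, lam=0.01, gamma=0.01,
              n_iter=200, seed=0, eps=1e-9):
    """Simplified joint NMF with network regularization (illustrative only).

    X_m, X_t, X_g : non-negative (samples x features) expression matrices
                    for miRNAs, TFs and genes, sharing the same samples.
    A : (miRNAs x genes) prior miRNA-gene adjacency, non-negative.
    B : (TFs x genes) prior TF-gene adjacency, non-negative.
    C : (genes x genes) symmetric prior gene-gene adjacency, non-negative.
    k : number of modules. lam and gamma are assumed penalty weights.
    """
    rng = np.random.default_rng(seed)
    n = X_m.shape[0]
    W = rng.random((n, k))                    # shared sample-by-module basis
    H_m = rng.random((k, X_m.shape[1]))       # module-by-miRNA coefficients
    H_t = rng.random((k, X_t.shape[1]))       # module-by-TF coefficients
    H_g = rng.random((k, X_g.shape[1]))       # module-by-gene coefficients

    for _ in range(n_iter):
        # Update the shared basis using all three data types.
        num = X_m @ H_m.T + X_t @ H_t.T + X_g @ H_g.T
        den = W @ (H_m @ H_m.T + H_t @ H_t.T + H_g @ H_g.T) + eps
        W *= num / den

        # Update each coefficient matrix. Network terms enter the numerator
        # (they pull linked features into the same module); the L1 penalty
        # enters the denominator (it shrinks small coefficients).
        WtW = W.T @ W
        H_m *= (W.T @ X_m + 0.5 * lam * (H_g @ A.T)) / (WtW @ H_m + 0.5 * gamma + eps)
        H_t *= (W.T @ X_t + 0.5 * lam * (H_g @ B.T)) / (WtW @ H_t + 0.5 * gamma + eps)
        H_g *= (W.T @ X_g + 0.5 * lam * (H_m @ A + H_t @ B) + lam * (H_g @ C)) \
            / (WtW @ H_g + 0.5 * gamma + eps)

    return W, H_m, H_t, H_g


def relative_error(X, W, H):
    """Relative Frobenius reconstruction error of X by W @ H."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)


def module_members(H, t=1.5):
    """For each module (row of H), return feature indices whose row-wise
    z-score exceeds t. The threshold is an assumed, tunable choice."""
    z = (H - H.mean(axis=1, keepdims=True)) / (H.std(axis=1, keepdims=True) + 1e-12)
    return [np.flatnonzero(row > t) for row in z]
```

Because every factor in each update is non-negative, the matrices stay non-negative throughout, which is what makes the rows of `H_m`, `H_t` and `H_g` interpretable as module memberships. The simplified objective here is not guaranteed to be bounded or to decrease monotonically, so the penalty weights should be kept small relative to the data scale.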