Dou Zengfa, Ma Xiaoke
The 20-th Research Institute, China Electronics Technology Group Corporation, Xi'an, China.
School of Computer Science and Technology, Xidian University, Xi'an, China.
Front Genet. 2021 Aug 24;12:706952. doi: 10.3389/fgene.2021.706952. eCollection 2021.
Gene expression and methylation are critical biological processes in cells, and integrating these heterogeneous data, a problem that has been extensively investigated, is the foundation for revealing the underlying patterns of cancers. The vast majority of current algorithms fuse gene methylation and expression into a single network and therefore fail to fully explore the relations between the two data types and their heterogeneity. To resolve these problems, in this study we define an epigenetic module as a gene set whose members are both co-methylated and co-expressed. To address the heterogeneity of the data, we construct the gene co-expression network and the co-methylation network separately. An epigenetic module is then characterized as a common module across multiple networks. We propose a non-negative matrix factorization-based algorithm, called Ep-jNMF, that jointly clusters the co-expression and co-methylation networks to discover epigenetic modules. Ep-jNMF is more accurate than the baselines on artificial data. Moreover, Ep-jNMF identifies more biologically meaningful modules, and these modules can predict the subtypes of cancers. These results indicate that Ep-jNMF is effective for the integration of expression and methylation data.
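The idea of finding a common module across a co-expression network and a co-methylation network can be illustrated with a generic joint NMF. The sketch below is not the Ep-jNMF objective, which the abstract does not specify. It assumes a plain Frobenius-norm joint factorization with a shared basis matrix and standard multiplicative updates, networks built from absolute Pearson correlation, and synthetic data. All function and variable names (`correlation_network`, `joint_nmf`, `modules`) are hypothetical.

```python
import numpy as np

def correlation_network(X):
    """Gene-by-gene network: absolute Pearson correlation between rows of X.
    Assumes no row is constant (a constant row would yield NaN)."""
    A = np.abs(np.corrcoef(X))
    np.fill_diagonal(A, 0.0)  # drop self-loops
    return A

def joint_nmf(networks, k, n_iter=200, eps=1e-9, seed=0):
    """Factor every network A_i as W @ H_i with one shared non-negative W.
    Generic multiplicative updates for sum_i ||A_i - W H_i||_F^2."""
    rng = np.random.default_rng(seed)
    n = networks[0].shape[0]
    W = rng.random((n, k))
    Hs = [rng.random((k, n)) for _ in networks]
    losses = []
    for _ in range(n_iter):
        # Shared basis: numerator and denominator accumulate over networks.
        num = sum(A @ H.T for A, H in zip(networks, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W *= num / den
        # Network-specific coefficients.
        Hs = [H * (W.T @ A) / (W.T @ W @ H + eps)
              for A, H in zip(networks, Hs)]
        losses.append(sum(np.linalg.norm(A - W @ H) ** 2
                          for A, H in zip(networks, Hs)))
    return W, Hs, losses

# Synthetic example: genes 0-9 form a module in both data types.
rng = np.random.default_rng(1)
n_genes, n_samples = 30, 40
expr = rng.normal(size=(n_genes, n_samples))
meth = rng.normal(size=(n_genes, n_samples))
expr[:10] += 3.0 * rng.normal(size=n_samples)  # shared expression signal
meth[:10] += 3.0 * rng.normal(size=n_samples)  # shared methylation signal

A_expr = correlation_network(expr)
A_meth = correlation_network(meth)
W, Hs, losses = joint_nmf([A_expr, A_meth], k=3)

# Assign each gene to the factor with the largest loading in the shared basis.
modules = W.argmax(axis=1)
```

Because `W` is shared, a gene set receives a strong common loading only when it is densely connected in both networks, which mirrors the definition of a module whose members are co-methylated and co-expressed. Assigning genes by the largest loading is one simple choice; thresholding the columns of `W` would instead allow overlapping modules.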