A network-based approach to identify disease-associated gene modules through integrating DNA methylation and gene expression.

Author information

Zhang Yuanyuan, Zhang Junying, Liu Zhaowen, Liu Yajun, Tuo Shouheng

Affiliations

School of Computer Science and Technology, Xidian University, Xi'an 710071, PR China.

Publication information

Biochem Biophys Res Commun. 2015 Sep 25;465(3):437-42. doi: 10.1016/j.bbrc.2015.08.033. Epub 2015 Aug 14.

Abstract

The formation and progression of complex diseases generally result from the joint effect of genetic and epigenetic disorders, so an integrative analysis of epigenetic and genetic data is essential for understanding disease mechanisms. In this study, we integrate Illumina 450k DNA methylation and gene expression data to calculate the weights of a gene network using Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA). The approach considers the methylation values of all CpG sites in a gene rather than averaging them, as other studies have done, which ignores the variability across methylation sites. By comparing the topological features of the control network with those of the case network, including global and local features, candidate disease-associated genes and gene modules are identified. We apply the approach to real data from breast invasive carcinoma (BRCA). It successfully identifies breast cancer susceptibility genes, such as TP53, BRCA1, EP300, CDK2 and MCM7, among others, most of which have previously been linked to breast cancer. In addition, GO and pathway enrichment analyses indicate that these genes are enriched in cell apoptosis and regulation of cell death, which are cancer-related biological processes. Importantly, by analyzing the functions of these genes and comparing their expression and methylation values between cases and controls, we find some genes, such as VASN and SNRPD3, and gene modules targeted by POLR2C, CHMP1B and TAF9, that might be novel breast cancer-related biomarkers.
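
The abstract does not spell out how PCA and CCA are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each gene is described by its expression plus the top principal-component scores of all its CpG methylation values (instead of a single averaged value), and that the weight between two genes is the first canonical correlation between those feature sets. The function names (`pca_scores`, `first_canonical_correlation`, `gene_features`, `edge_weight`), the choice of two components, and the synthetic data are all hypothetical.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project samples (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)                      # center each CpG column
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, U.shape[1])
    return U[:, :k] * S[:k]                      # PC scores, shape (n_samples, k)


def first_canonical_correlation(X, Y):
    """Largest canonical correlation between the columns of X and Y.

    Uses the QR-based formulation: the canonical correlations are the
    singular values of Qx^T Qy, where Qx and Qy are orthonormal bases of
    the centered data. Assumes more samples than features and full
    column rank in both X and Y.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def gene_features(expr, meth, n_components=2):
    """Assumed representation: expression plus PCA scores of all CpG sites."""
    return np.column_stack([expr, pca_scores(meth, n_components)])


def edge_weight(expr_a, meth_a, expr_b, meth_b, n_components=2):
    """Hypothetical edge weight between two genes in one sample group."""
    return first_canonical_correlation(
        gene_features(expr_a, meth_a, n_components),
        gene_features(expr_b, meth_b, n_components),
    )


# Synthetic illustration only (not real methylation or expression data).
rng = np.random.default_rng(0)
n = 200                                          # samples
z = rng.normal(size=n)                           # latent factor shared by genes A and B


def simulate_gene(latent, n_cpg):
    """Expression and CpG values driven by `latent` plus small noise."""
    expr = latent + 0.1 * rng.normal(size=n)
    loadings = rng.uniform(0.5, 1.5, size=n_cpg)
    meth = latent[:, None] * loadings + 0.1 * rng.normal(size=(n, n_cpg))
    return expr, meth


expr_a, meth_a = simulate_gene(z, 5)
expr_b, meth_b = simulate_gene(z, 4)
expr_c, meth_c = simulate_gene(rng.normal(size=n), 6)   # unrelated gene C

w_ab = edge_weight(expr_a, meth_a, expr_b, meth_b)
w_ac = edge_weight(expr_a, meth_a, expr_c, meth_c)
print(f"weight(A, B) = {w_ab:.3f}, weight(A, C) = {w_ac:.3f}")
```

Under these assumptions, computing such weights separately for case and control samples would give two weighted networks whose topological features could then be compared, which is the comparison the abstract describes.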
