Taguchi Y-h
BMC Bioinformatics. 2015;16 Suppl 18(Suppl 18):S16. doi: 10.1186/1471-2105-16-S18-S16. Epub 2015 Dec 9.
Transgenerational epigenetics (TGE) is currently considered important in disease, but the mechanisms involved are not yet fully understood. TGE abnormalities expected to cause disease are likely to be initiated during development and to be mediated by aberrant gene expression associated with aberrant promoter methylation that is heritable between generations. However, because methylation is removed and then re-established during development, it is not easy to identify promoter methylation abnormalities by comparing normal lineages with those expected to exhibit TGE abnormalities.
This study applied the recently proposed principal component analysis (PCA)-based unsupervised feature extraction to previously reported, publicly available gene expression/promoter methylation profiles of rat primordial germ cells between E13 and E16 from the F3 generation vinclozolin lineage, which is expected to exhibit TGE abnormalities, in order to identify multiple genes that exhibited aberrant gene expression/promoter methylation during development.
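The general idea of PCA-based unsupervised feature extraction can be illustrated with a minimal sketch. It assumes NumPy, a genes-by-samples matrix, and a simple "top N by absolute score" rule. The function names (`pca_gene_scores`, `select_outlier_genes`), the synthetic data, and the choice of the first component are illustrative assumptions, not the author's actual pipeline or data.

```python
import numpy as np

def pca_gene_scores(X, n_components=2):
    """PCA in which genes are the points and samples are the variables.

    X has shape (genes, samples). Returns gene scores (genes, k) and
    sample loadings (samples, k).
    """
    # Center each sample column across genes.
    Xc = X - X.mean(axis=0, keepdims=True)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T
    return scores, loadings

def select_outlier_genes(scores, component, top_n):
    """Indices of the top_n genes with the largest absolute score on one component."""
    order = np.argsort(-np.abs(scores[:, component]))
    return order[:top_n]

# Synthetic illustration (assumed data): 200 genes, 8 samples,
# with the first 10 genes shifted upward in the last 4 samples.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 8))
X[:10, 4:] += 8.0

scores, loadings = pca_gene_scores(X, n_components=2)

# In practice the component would be chosen by inspecting the sample loadings
# for the pattern of interest; here the first component is assumed.
selected = select_outlier_genes(scores, component=0, top_n=10)
print(sorted(selected.tolist()))
```

No class labels enter the selection step, which is what makes the procedure unsupervised: sample annotations are only consulted when deciding which component's loadings reflect the pattern of interest.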
The biological feasibility of the identified genes was tested via enrichment analyses of various biological concepts, including pathway analysis, gene ontology terms and protein-protein interactions. All validations suggested superiority of the proposed method over three conventional and popular supervised methods that employed the t-test, limma and significance analysis of microarrays, respectively. The identified genes were globally related to tumors, the prostate, kidney, testis and the immune system and were previously reported to be related to various diseases caused by TGE.
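As a rough illustration of what an enrichment analysis computes, the sketch below implements a generic one-sided over-representation test based on the hypergeometric distribution. This is a common textbook formulation and is only assumed here; the abstract does not state which tools or statistics were actually used. The function name and the toy numbers are hypothetical.

```python
from math import comb

def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(overlap >= n_overlap).

    n_background: number of genes in the background set
    n_in_term:    background genes annotated with the term (e.g. a pathway)
    n_selected:   number of genes picked by the feature extraction
    n_overlap:    selected genes that carry the annotation
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_term, n_selected)
    tail = sum(
        comb(n_in_term, i) * comb(n_background - n_in_term, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers (assumed): 20 background genes, 5 annotated, 5 selected,
# and all 5 selected genes carry the annotation.
p = enrichment_p_value(20, 5, 5, 5)
# With a required overlap of zero the tail covers every outcome.
p_all = enrichment_p_value(20, 5, 5, 0)
print(p, p_all)
```

A small p-value indicates that the selected genes overlap the annotated set more than random selection would be expected to produce; in a real analysis the p-values would also need correction for the number of terms tested.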
Among the genes reported by PCA-based unsupervised feature extraction, we propose that chemokine signaling pathways and leucine-rich repeat proteins are key factors that initiate transgenerational epigenetic-mediated diseases, because multiple genes included in these two categories were identified in this study.