College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
Gene. 2013 Apr 15;518(2):425-30. doi: 10.1016/j.gene.2012.12.022. Epub 2013 Jan 9.
Polycomb group (PcG) proteins are epigenetic regulators that are essential for stem cell differentiation. Identifying PcG binding profiles is important for understanding the mechanisms of PcG-mediated repression in mammals. We applied a mapping-convergence (M-C) algorithm based on support vector machine (SVM) technology to the genome-wide identification of PcG target genes in human embryonic stem cells. The method combined histone modifications and transcription factor binding motifs, and it eliminated the need for the negative training samples that a traditional SVM requires. Good prediction accuracy was obtained under 3-fold cross-validation. Analysis of the 3133 PcG target genes identified by the model showed that PcG proteins suppress gene expression during differentiation. The results suggested that PcG and DNA methylation non-redundantly repress gene expression during differentiation. The genome-wide identification of PcG target genes will aid the further analysis of PcG mechanisms.
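The abstract does not give implementation details, so the following is only a minimal sketch of the general mapping-convergence idea: learning from positive and unlabeled examples without a curated negative set. It makes several assumptions that are not from the paper. A nearest-centroid classifier stands in for the SVM to keep the example dependency-free, and the `fit` argument marks where an SVM trainer would be plugged in. The mapping stage is simplified to "the unlabeled points farthest from the positive centroid". The names `mapping_convergence`, `fit_nearest_centroid`, and `seed_fraction`, and the two-dimensional toy features, are hypothetical.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def fit_nearest_centroid(pos, neg):
    """Stand-in for the SVM (assumption): predict True when a point is
    at least as close to the positive centroid as to the negative one."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: dist(x, cp) <= dist(x, cn)


def mapping_convergence(positives, unlabeled, fit=fit_nearest_centroid,
                        seed_fraction=0.25, max_iter=50):
    """Return the unlabeled points still predicted positive at convergence."""
    # Mapping stage (simplified): treat the unlabeled points farthest from
    # the positive centroid as initial "strong negatives".
    cp = centroid(positives)
    ranked = sorted(unlabeled, key=lambda x: dist(x, cp), reverse=True)
    n_seed = max(1, int(len(ranked) * seed_fraction))
    negatives, remaining = ranked[:n_seed], ranked[n_seed:]

    # Convergence stage: retrain, move newly predicted negatives into the
    # negative set, and stop once no remaining point is predicted negative.
    for _ in range(max_iter):
        predict = fit(positives, negatives)
        new_negatives = [x for x in remaining if not predict(x)]
        if not new_negatives:
            break
        negatives = negatives + new_negatives
        remaining = [x for x in remaining if predict(x)]
    return remaining


if __name__ == "__main__":
    # Toy 2-D features, e.g. (histone-mark signal, motif score); made up.
    known_targets = [(5.0, 5.0), (5.5, 4.5), (4.5, 5.5), (5.0, 4.0)]
    candidates = [(5.2, 5.1), (4.8, 4.6), (0.0, 0.0), (0.5, 0.2),
                  (1.0, 0.5), (0.2, 1.0), (2.0, 1.5), (1.5, 2.0)]
    print(sorted(mapping_convergence(known_targets, candidates)))
```

On this toy data the loop keeps only the two candidates that lie near the known targets. In a realistic setting the stand-in classifier would be replaced by an SVM trained on the actual histone-modification and motif features, and the stopping rule and seed selection would need tuning.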