School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.
Nucleic Acids Res. 2012 Nov;40(21):10642-56. doi: 10.1093/nar/gks848. Epub 2012 Sep 18.
We report the development of an unsupervised algorithm for the genome-wide discovery and analysis of chromatin signatures. Our Chromatin-profile Alignment followed by Tree-clustering algorithm (ChAT) applies dynamic programming to combinatorial histone modification profiles to identify locally similar chromatin sub-regions, and it complements existing methods. We applied ChAT to genomic maps of 39 histone modifications in human CD4(+) T cells to identify both known and novel chromatin signatures. ChAT detected chromatin signatures previously associated with transcription start sites and enhancers, as well as novel signatures associated with a variety of regulatory elements. Promoter-associated signatures discovered with ChAT indicate that complex chromatin signatures, made up of numerous co-located histone modifications, facilitate cell-type specific gene expression. The discovery of novel L1 retrotransposon-associated bivalent chromatin signatures suggests that these elements influence the mono-allelic expression of human genes by shaping the chromatin environment of imprinted genomic regions. Analysis of long gene-associated chromatin signatures points to a role for the H4K20me1 and H3K79me3 histone modifications in transcriptional pause release. The novel chromatin signatures and functional associations uncovered by ChAT underscore the ability of the algorithm to yield new insight into chromatin-based regulatory mechanisms.
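The abstract does not give ChAT's scoring scheme, so the following is only a minimal sketch of the general idea of finding locally similar sub-regions by dynamic programming, not the published implementation. It assumes each genomic bin is represented as the set of histone modifications present in it, scores a pair of bins by Jaccard similarity minus a hypothetical offset, and runs a Smith-Waterman-style local alignment. The names `bin_score` and `local_align`, the `gap` and `offset` values, and the toy profiles are all illustrative assumptions. The tree-clustering step is not sketched.

```python
from typing import FrozenSet, List, Sequence, Tuple

# A chromatin profile: one set of histone modification names per genomic bin.
Profile = Sequence[FrozenSet[str]]


def bin_score(a: FrozenSet[str], b: FrozenSet[str], offset: float = 0.5) -> float:
    """Similarity of two bins: Jaccard index minus an offset (assumed scoring)."""
    union = a | b
    if not union:
        # Two empty bins carry no signal, so they are penalized.
        return -offset
    return len(a & b) / len(union) - offset


def local_align(
    p: Profile, q: Profile, gap: float = 0.5, offset: float = 0.5
) -> Tuple[float, Tuple[int, int], Tuple[int, int]]:
    """Smith-Waterman-style local alignment of two profiles.

    Returns (best score, (start, end) in p, (start, end) in q),
    with half-open index ranges suitable for slicing.
    """
    rows, cols = len(p) + 1, len(q) + 1
    H: List[List[float]] = [[0.0] * cols for _ in range(rows)]
    best, best_pos = 0.0, (0, 0)

    # Fill the score matrix; scores are floored at zero so that an
    # alignment can start anywhere (the "local" part).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + bin_score(p[i - 1], q[j - 1], offset)
            up = H[i - 1][j] - gap
            left = H[i][j - 1] - gap
            H[i][j] = max(0.0, diag, up, left)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + bin_score(p[i - 1], q[j - 1], offset)
        if H[i][j] == diag:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    return best, (i, end_i), (j, end_j)


# Toy profiles (made-up data) sharing a two-bin sub-region.
fs = frozenset
region_a = [fs(), fs({"H3K4me3", "H3K27ac"}), fs({"H3K4me3", "H3K27ac", "H3K9ac"}),
            fs({"H3K9ac"}), fs()]
region_b = [fs({"H3K27me3"}), fs({"H3K4me3", "H3K27ac"}),
            fs({"H3K4me3", "H3K27ac", "H3K9ac"}), fs({"H3K36me3"})]

score, span_a, span_b = local_align(region_a, region_b)
```

Here `region_a[span_a[0]:span_a[1]]` and `region_b[span_b[0]:span_b[1]]` are the matched sub-regions. In a genome-wide setting, pairwise scores of this kind could feed a hierarchical clustering step, but how ChAT does so is not stated in the abstract.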