Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada.
PLoS One. 2010 Nov 29;5(11):e14122. doi: 10.1371/journal.pone.0014122.
Chromatin modification (CM) plays a key role in regulating transcription, DNA replication, repair and recombination. However, our knowledge of these processes in humans remains very limited. Here we use computational approaches to study proteins and functional domains involved in CM in humans. We analyze the abundance and the pairwise domain-domain co-occurrences of 25 well-documented CM domains in 5 model organisms: yeast, worm, fly, mouse and human. Results show that domains involved in histone methylation, DNA methylation, and histone variants are remarkably expanded in metazoans, reflecting the increased demand for cell type-specific gene regulation. We find that CM domains tend to co-occur with a limited number of partner domains and are hence not promiscuous. This property is exploited to identify 47 potentially novel CM domains, including 24 DNA-binding domains, whose role in CM has received little attention so far. Lastly, we use a consensus machine learning approach to predict 379 novel CM genes (coding for 329 proteins) in humans based on domain compositions. Several of these predictions are supported by very recent experimental studies and others are slated for experimental verification. Identification of novel CM genes and domains in humans will aid our understanding of fundamental epigenetic processes that are important for stem cell differentiation and cancer biology. Information on all the candidate CM domains and genes reported here is publicly available.
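The notion of domain promiscuity used in the abstract (how many distinct partner domains a given domain co-occurs with) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `partner_counts`, the input layout (protein identifier mapped to a list of domain names), and the toy protein compositions are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def partner_counts(proteins):
    """Count, for each domain, the distinct domains it co-occurs with.

    `proteins` maps a protein identifier to the list of domain names
    annotated on that protein (a hypothetical input layout).
    """
    partners = defaultdict(set)
    for domains in proteins.values():
        unique = sorted(set(domains))  # ignore repeated copies of a domain
        for d in unique:
            partners[d]  # register the domain even if it has no partner
        for a, b in combinations(unique, 2):
            partners[a].add(b)
            partners[b].add(a)
    return {d: len(p) for d, p in partners.items()}


# Toy, made-up domain compositions (not data from the study).
toy = {
    "protA": ["SET", "PHD"],
    "protB": ["SET", "PHD", "Bromodomain"],
    "protC": ["Chromo"],
}
counts = partner_counts(toy)
for domain, n in sorted(counts.items()):
    print(domain, n)
```

A domain with a low partner count relative to its abundance would be considered non-promiscuous in this simplified view; a real analysis would also need domain annotations from a curated resource and a normalization for domain abundance.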