Ederveen Thomas H A, Mandemaker Imke K, Logie Colin
Department of Molecular Biology, Nijmegen Centre for Molecular Life Sciences, Radboud University, The Netherlands.
Biochim Biophys Acta. 2011 Oct;1809(10):577-86. doi: 10.1016/j.bbagrm.2011.07.002. Epub 2011 Jul 14.
Histones are highly basic, relatively small proteins that complex with DNA to form higher-order structures that underlie chromosome topology. Of the four core histones H2A, H2B, H3 and H4, it is H3 that is most heavily modified at the post-translational level. The human genome harbours 16 annotated bona fide histone H3 genes which code for four H3 protein variants. In 2010, two novel histone H3.3 protein variants were reported, carrying over twenty amino acid substitutions. Nevertheless, they appear to be incorporated into chromatin. Interestingly, these new H3 genes are located on human chromosome 5 in a repetitive region that harbours an additional five H3 pseudogenes, but no other core histone ORFs. In addition, a human-specific novel putative histone H3.3 variant located at 12p11.21 was reported in 2011. These developments raised the question of how many more human histone H3 ORFs there may be. Using homology searches, we detected 41 histone H3 pseudogenes in the current human genome assembly. The large majority are derived from the H3.3 gene H3F3A, and three of those may code for yet more histone H3.3 protein variants. We also identified one extra intact H3.2-type variant ORF in the vicinity of the canonical HIST2 gene cluster at chromosome 1p21.2. RNA polymerase II occupancy data revealed heterogeneity in H3 gene expression in human cell lines. However, none of the novel H3 genes were significantly occupied by RNA polymerase II in the data sets at hand. We discuss the implications of these recent developments.