CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Rd, St Lucia, Queensland 4067, Australia.
BMC Genomics. 2010 Nov 23;11:654. doi: 10.1186/1471-2164-11-654.
About forty human diseases are caused by repeat instability mutations. A distinct subset of these diseases results from extreme expansions of polymorphic trinucleotide repeats, typically CAG repeats encoding poly-glutamine (poly-Q) tracts in proteins. Polymorphic repeat length variation is also apparent in human poly-Q encoding genes from normal individuals. As these coding sequence repeats are subject to selection in mammals, it has been suggested that normal variations in some of these typically highly conserved genes are implicated in morphological differences between species and phenotypic variations within species. At present, poly-Q encoding genes in non-human mammalian species are poorly documented, as are their functions and propensities for polymorphic variation.
The current investigation identified 178 bovine poly-Q encoding genes (Q ≥ 5) and, within this group, 26 genes whose orthologs in both human and mouse did not contain poly-Q repeats. The bovine poly-Q encoding genes typically had ubiquitous expression patterns, although there was bias towards expression in epithelia, brain and testes. They were also characterised by unusually large sizes. Analysis of gene ontology terms revealed that the encoded proteins were strongly enriched for functions associated with transcriptional regulation, and many contributed to physical interaction networks in the nucleus, where they presumably act cooperatively in transcriptional regulatory complexes. In addition, the coding sequence CAG repeats in some bovine genes impacted mRNA splicing, thereby generating unusual transcriptional diversity, which in at least one instance was tissue-specific. The poly-Q encoding genes were prioritised using multiple criteria for their likelihood of being polymorphic, and the highest ranking group was then experimentally tested for polymorphic variation within a cattle diversity panel. Extensive and meiotically stable variation was identified.
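As an illustration of the Q ≥ 5 threshold mentioned above, the following minimal Python sketch scans a protein sequence for runs of at least five consecutive glutamine residues. It is not the pipeline used in the study: the function name `find_polyq_tracts`, the requirement that the run be uninterrupted, and the example sequence are all assumptions made for illustration.

```python
import re

# Assumption: a poly-Q tract is an uninterrupted run of at least
# MIN_Q glutamine (Q) residues; the study's exact criteria may differ.
MIN_Q = 5


def find_polyq_tracts(protein, min_q=MIN_Q):
    """Return (start, length) for each run of >= min_q consecutive Q residues.

    `start` is a 0-based index into the protein sequence.
    """
    pattern = re.compile(r"Q{%d,}" % min_q)
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(protein.upper())]


# Hypothetical sequence: one tract of 7 Q, a short QQQ run that falls
# below the threshold, and one tract of exactly 5 Q.
example = "MKT" + "Q" * 7 + "LAQQQP" + "Q" * 5 + "R"
tracts = find_polyq_tracts(example)
```

Applied to every predicted protein in a proteome, a scan of this kind would yield a candidate list of poly-Q encoding genes that could then be compared across species.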
Transcriptional diversity can potentially be generated in poly-Q encoding genes by the impact of CAG repeat tracts on alternative mRNA splicing. This effect, combined with the physical interactions of the encoded proteins in large transcriptional regulatory complexes, suggests that polymorphic variations of proteins in these complexes have strong potential to affect phenotype.