Mohammadparast Saeid, Bayat Hadi, Biglarian Akbar, Ohadi Mina
Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran.
Am J Primatol. 2014 Aug;76(8):747-56. doi: 10.1002/ajp.22266. Epub 2014 Feb 21.
Adaptive evolution may be linked with the genomic distribution and function of short tandem repeats (STRs). The proximity of core promoter STRs to the +1 transcription start site (TSS) and their mutable nature highlight these STRs as a novel source of interspecies variation. The PAXBP1 gene (also known as GCFC1) core promoter contains the longest STR identified in a Homo sapiens gene core promoter. Indeed, this core promoter is a stretch of four consecutive CT-STRs. In the current study, we used the Ensembl, NCBI, and UCSC databases to analyze the evolutionary trend and functional implication of this CT-STR complex in six major vertebrate lineages: primates, non-primate mammals, birds, reptiles, amphibians, and fish. We observed exceptional expansion (≥4 repeats) and conservation of this CT-STR complex across primates, except in the prosimians Microcebus murinus and Otolemur garnettii (Fisher exact P < 4.1 × 10⁻⁷). H. sapiens has the most complex STR formula and the longest repeats. The monkeys Macaca mulatta and Callithrix jacchus have the simplest STR formulas and the lowest repeat numbers. CT repeats of ≥4 units were not detected in non-primate lineages. Alleles of different lengths across the PAXBP1 core promoter CT-STRs significantly altered gene expression in vitro (P < 0.001, t-test). PAXBP1 has a crucial role in craniofacial development, myogenesis, and spine morphogenesis, traits that have diverged between primates and non-primates. To our knowledge, this is the first instance of expansion and conservation of an STR complex co-occurring specifically with the primate lineage.
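The expansion criterion used in the abstract (a CT-STR of ≥4 repeats) can be illustrated with a minimal sketch. The abstract does not describe how repeats were actually scored in the database sequences, so everything below is an assumption made for illustration: the function name find_ct_strs is hypothetical, the regular-expression approach is one possible way to scan a sequence, and the toy sequence is invented and is not the PAXBP1 core promoter.

```python
import re


def find_ct_strs(sequence, min_repeats=4):
    """Return (start, repeat_count) for each uninterrupted run of 'CT'
    that has at least `min_repeats` units. Hypothetical helper, not the
    authors' method; matching is case-insensitive."""
    pattern = re.compile(r"(?:CT){%d,}" % min_repeats, re.IGNORECASE)
    # Each match is a whole number of 2-base units, so length // 2 is the repeat count.
    return [(m.start(), len(m.group()) // 2) for m in pattern.finditer(sequence)]


# Invented toy sequence: runs of 5, 2, and 4 CT units separated by spacers.
toy = "GGA" + "CT" * 5 + "AA" + "CT" * 2 + "GG" + "CT" * 4 + "TT"

print(find_ct_strs(toy))  # [(3, 5), (21, 4)]
```

Only the runs of 5 and 4 units are reported; the run of 2 falls below the ≥4 threshold. A sketch like this treats each uninterrupted run separately, so a complex of several adjacent CT-STRs would appear as multiple entries.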