Shoja Valia, Zhang Liqing
Department of Computer Science, Virginia Tech, Blacksburg, VA, USA.
Mol Biol Evol. 2006 Nov;23(11):2134-41. doi: 10.1093/molbev/msl085. Epub 2006 Aug 10.
Tandemly arrayed genes (TAGs) play important functional and physiological roles in the genome. Most previous studies have focused on individual TAG families in a few species, and a broad characterization of TAGs has not been available. Here we identified all TAGs in the human, mouse, and rat genomes and performed a comprehensive analysis of TAG distribution, TAG sizes, TAG orientations and intergenic distances, and TAG functions. TAGs account for about 14-17% of all genes in the genome and nearly one-third of all duplicated genes, highlighting the predominant role that tandem duplication plays in gene duplication. In all three species, TAG distribution is highly heterogeneous along chromosomes: some chromosomes are enriched with TAG forests, whereas others are enriched with TAG deserts. The majority of TAGs are of size 2 in all genomes, similar to previous findings in Caenorhabditis elegans, Arabidopsis thaliana, and Oryza sativa, suggesting that this is a rather general phenomenon in eukaryotes. Comparison with genome-wide patterns shows that TAG members have a significantly higher proportion of parallel gene orientation in all species, corroborating Graham's claim that parallel orientation is the preferred form of orientation in TAGs. Moreover, TAG members with parallel orientation tend to be closer to each other than neighboring genes with parallel orientation in the genome as a whole. Analyses of Gene Ontology function indicate that genes with receptor or binding activities are significantly overrepresented among TAGs. Computer simulation reveals that random gene rearrangements have little effect on the statistics of TAGs in all genomes. Finally, the average proportion of TAGs tends to increase with family size, although the correlation between TAG proportions in individual families and family sizes is not significant.
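The abstract does not spell out how TAGs were called, so the following is only a minimal sketch of the kind of bookkeeping involved, not the authors' pipeline. It assumes a TAG is a run of same-family genes on one chromosome with at most `max_spacers` unrelated genes between consecutive members. The input tuple layout, the function names `find_tags` and `parallel_fraction`, and the `max_spacers` parameter are all hypothetical.

```python
from collections import Counter

# Hypothetical input record: (chromosome, start, strand, family_id).
# family_id is None for genes that belong to no duplicated family.

def find_tags(genes, max_spacers=0):
    """Group same-family genes into tandem arrays (assumed definition).

    Two same-family genes join one array when at most `max_spacers`
    other genes lie between them on the same chromosome.
    """
    by_chrom = {}
    for gene in genes:
        by_chrom.setdefault(gene[0], []).append(gene)

    arrays = []
    for chrom_genes in by_chrom.values():
        chrom_genes.sort(key=lambda g: g[1])  # order along the chromosome
        last_seen = {}  # family_id -> (gene index, array being built)
        for i, gene in enumerate(chrom_genes):
            family = gene[3]
            if family is None:
                continue
            prev = last_seen.get(family)
            if prev is not None and i - prev[0] - 1 <= max_spacers:
                prev[1].append(gene)          # extend the current array
                last_seen[family] = (i, prev[1])
            else:
                new_array = [gene]            # start a candidate array
                arrays.append(new_array)
                last_seen[family] = (i, new_array)
    # Only runs of two or more genes count as arrays.
    return [a for a in arrays if len(a) >= 2]


def parallel_fraction(arrays):
    """Fraction of consecutive array members on the same strand."""
    pairs = [(a[i][2], a[i + 1][2]) for a in arrays for i in range(len(a) - 1)]
    if not pairs:
        return 0.0
    return sum(s1 == s2 for s1, s2 in pairs) / len(pairs)


if __name__ == "__main__":
    toy_genes = [
        ("chr1", 100, "+", "F1"), ("chr1", 200, "+", "F1"),
        ("chr1", 300, "-", "F2"), ("chr1", 400, "+", "F1"),
        ("chr2", 50, "-", "F2"), ("chr2", 90, "+", "F2"),
    ]
    tags = find_tags(toy_genes, max_spacers=0)
    print(Counter(len(a) for a in tags))  # array size distribution
    print(parallel_fraction(tags))        # share of parallel pairs
```

The toy data are invented for illustration. A size distribution and a parallel-orientation fraction computed this way correspond to the TAG size and orientation statistics the abstract reports, but the thresholds and family assignments used in the study are not given here.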