Liu Renyi, Vitte Clémentine, Ma Jianxin, Mahama A Assibi, Dhliwayo Thanda, Lee Michael, Bennetzen Jeffrey L
Department of Genetics, University of Georgia, Athens, GA 30602, USA.
Proc Natl Acad Sci U S A. 2007 Jul 10;104(28):11844-9. doi: 10.1073/pnas.0704258104. Epub 2007 Jul 5.
Analysis of the sequences of 74 randomly selected BACs demonstrated that the maize nuclear genome contains approximately 37,000 candidate genes with homologues in other plant species. An additional approximately 5,500 predicted genes are severely truncated and probably pseudogenes. The distribution of genes is uneven, with approximately 30% of BACs containing no genes. BAC gene density varies from 0 to 7.9 per 100 kb, whereas most gene islands contain only one gene. The average number of genes per gene island is 1.7. Only 72% of these genes show collinearity with the rice genome. Particular LTR retrotransposon families (e.g., Gyma) are enriched on gene-free BACs, most of which do not come from pericentromeres or other large heterochromatic regions. Gene-containing BACs are relatively enriched in different families of LTR retrotransposons (e.g., Ji). Two major bursts of LTR retrotransposon activity in the last 2 million years are responsible for the large size of the maize genome, but only the more recent of these is well represented in gene-containing BACs, suggesting that LTR retrotransposons are more efficiently removed in these domains. The results demonstrate that sample sequencing and careful annotation of a few randomly selected BACs can provide a robust description of a complex plant genome.
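The extrapolation from a random BAC sample to a genome-wide gene count can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Bac` record, the function names, and the toy numbers are hypothetical and chosen only to make the arithmetic easy to follow. The sketch assumes that the sampled BACs are an unbiased sample of the genome, so that pooled gene density scales linearly to the full genome size.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bac:
    """Hypothetical record for one annotated BAC (not from the paper)."""
    length_bp: int   # sequenced length of the BAC in base pairs
    gene_count: int  # number of candidate genes annotated on it


def gene_density_per_100kb(bac: Bac) -> float:
    """Genes per 100 kb for a single BAC."""
    return bac.gene_count / bac.length_bp * 100_000


def gene_free_fraction(bacs: List[Bac]) -> float:
    """Fraction of BACs that carry no annotated gene."""
    return sum(1 for b in bacs if b.gene_count == 0) / len(bacs)


def extrapolate_gene_count(bacs: List[Bac], genome_size_bp: int) -> float:
    """Scale the pooled sample density to the whole genome.

    Assumes the BACs are a random, unbiased sample of the genome.
    """
    total_genes = sum(b.gene_count for b in bacs)
    total_bp = sum(b.length_bp for b in bacs)
    return total_genes / total_bp * genome_size_bp


# Toy inputs for illustration only; these are NOT data from the study.
toy_bacs = [
    Bac(length_bp=150_000, gene_count=0),
    Bac(length_bp=100_000, gene_count=3),
    Bac(length_bp=200_000, gene_count=1),
    Bac(length_bp=50_000, gene_count=0),
]
toy_genome_bp = 5_000_000  # hypothetical genome size, not the maize genome

estimated_genes = extrapolate_gene_count(toy_bacs, toy_genome_bp)
fraction_gene_free = gene_free_fraction(toy_bacs)
```

Pooling genes and base pairs across all BACs before dividing weights each BAC by its length, which is one reasonable choice when BAC sizes differ; averaging per-BAC densities would weight every BAC equally and can give a different estimate.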