Haberer Georg, Young Sarah, Bharti Arvind K, Gundlach Heidrun, Raymond Christina, Fuks Galina, Butler Ed, Wing Rod A, Rounsley Steve, Birren Bruce, Nusbaum Chad, Mayer Klaus F X, Messing Joachim
Munich Information Center for Protein Sequences, Institute for Bioinformatics, Gesellschaft für Strahlenforschung Research Center for Environment and Health, D-85764 Neuherberg, Germany.
Plant Physiol. 2005 Dec;139(4):1612-24. doi: 10.1104/pp.105.068718.
Maize (Zea mays or corn) plays many varied and important roles in society. It is not only an important experimental model plant, but also a major livestock feed crop and a significant source of industrial products such as sweeteners and ethanol. In this study, we report the systematic analysis of contiguous sequences of the maize genome. We selected 100 random regions averaging 144 kb in size, representing about 0.6% of the genome, and generated a high-quality dataset for sequence analysis. This sample contains 330 annotated genes, 91% of which are supported by expressed sequence tag data from maize and other cereal species. Genes averaged 4 kb in size with five exons, although the largest was over 59 kb with 31 exons. Gene density varied over a wide range, from 0.5 to 10.7 genes per 100 kb, and genes did not appear to cluster significantly. The total repetitive element content we observed (66%) was slightly higher than previous whole-genome estimates (58%-63%) and consisted almost exclusively of retroelements. The vast majority of genes can be aligned to at least one sequence read derived from gene-enrichment procedures, but only about 30% are fully covered. Our results indicate that much of the increase in genome size of maize relative to rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana) is attributable to an increase in the number of both repetitive elements and genes.
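The sampling figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the study: the function and constant names are hypothetical, and because the abstract gives the sampled fraction as "about 0.6%", the implied genome size is approximate. The density computed here is an aggregate over all sampled sequence (total genes divided by total length), which is not the same as an average of per-region densities.

```python
# Rough consistency check of the figures quoted in the abstract.
# All names are illustrative; only the four constants come from the text.

N_REGIONS = 100           # sampled regions
MEAN_REGION_KB = 144      # average region size, in kb
SAMPLED_FRACTION = 0.006  # "about 0.6% of the genome" (rounded in the source)
N_GENES = 330             # annotated genes in the sample


def sampled_kb(n_regions: int, mean_region_kb: float) -> float:
    """Total sampled sequence in kb."""
    return n_regions * mean_region_kb


def implied_genome_mb(total_kb: float, fraction: float) -> float:
    """Genome size in Mb implied by the sampled length and sampled fraction."""
    return total_kb / fraction / 1000


def aggregate_genes_per_100kb(n_genes: int, total_kb: float) -> float:
    """Genes per 100 kb over the pooled sample (not a per-region mean)."""
    return n_genes / total_kb * 100


total_kb = sampled_kb(N_REGIONS, MEAN_REGION_KB)
genome_mb = implied_genome_mb(total_kb, SAMPLED_FRACTION)
density = aggregate_genes_per_100kb(N_GENES, total_kb)

print(f"Sampled sequence: {total_kb / 1000:.1f} Mb")
print(f"Implied genome size: {genome_mb:.0f} Mb")
print(f"Aggregate gene density: {density:.2f} genes per 100 kb")
```

Under these assumptions the sample totals about 14.4 Mb, which implies a genome of roughly 2.4 Gb, and the aggregate density of about 2.3 genes per 100 kb falls inside the reported per-region range of 0.5 to 10.7.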