Fu Yan, Emrich Scott J, Guo Ling, Wen Tsui-Jung, Ashlock Daniel A, Aluru Srinivas, Schnable Patrick S
Interdepartmental Genetics Graduate Program, L. H. Baker Center for Bioinformatics and Biological Statistics, Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011, USA.
Proc Natl Acad Sci U S A. 2005 Aug 23;102(34):12282-7. doi: 10.1073/pnas.0503394102. Epub 2005 Aug 15.
Recent sequencing efforts have targeted the gene-rich regions of the maize (Zea mays L.) genome. We report the release of an improved assembly of maize assembled genomic islands (MAGIs). The 114,173 resulting contigs have been subjected to computational and physical quality assessments. Comparisons to the sequences of maize bacterial artificial chromosomes suggest that at least 97% (160 of 165) of MAGIs are correctly assembled. Because the rates at which junction-testing PCR primers for genomic survey sequences (90-92%) amplify genomic DNA are not significantly different from those of control primers (approximately 91%), we conclude that a very high percentage of genic MAGIs accurately reflect the structure of the maize genome. EST alignments, ab initio gene prediction, and sequence similarity searches of the MAGIs are available at the Iowa State University MAGI web site. This assembly contains 46,688 ab initio predicted genes. The expression of almost half (628 of 1,369) of a sample of the predicted genes that lack expression evidence was validated by RT-PCR. Our analyses suggest that the maize genome contains between approximately 33,000 and approximately 54,000 expressed genes. Approximately 5% (32 of 628) of the maize transcripts discovered do not have detectable paralogs among maize ESTs or detectable homologs from other species in the GenBank NR nucleotide/protein database. Analyses therefore suggest that this assembly of the maize genome contains approximately 350 previously uncharacterized expressed genes. We hypothesize that these "orphans" evolved quickly during maize evolution and/or domestication.
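The proportions quoted in the abstract can be checked directly from the counts it reports. The short Python sketch below is illustrative only: the dictionary labels and the `percent` helper are names chosen here, not taken from the paper, and it does not attempt to reproduce the genome-wide estimates (33,000 to 54,000 expressed genes, approximately 350 orphans), whose derivation the abstract does not describe.

```python
# Counts quoted in the abstract, as (numerator, denominator) pairs.
# The labels are paraphrases written for this sketch.
reported = {
    "MAGIs correctly assembled vs. BAC sequences": (160, 165),
    "sampled predicted genes validated by RT-PCR": (628, 1369),
    "validated transcripts with no detectable paralog or homolog": (32, 628),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Compute each proportion once so it can be printed or reused.
percentages = {label: percent(n, d) for label, (n, d) in reported.items()}

for label, value in percentages.items():
    print(f"{label}: {value:.1f}%")
```

The computed values agree with the abstract's rounded figures: about 97% correctly assembled, just under half validated by RT-PCR, and about 5% without detectable paralogs or homologs.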