Detailed analysis of a contiguous 22-Mb region of the maize genome.

Author affiliation

Arizona Genomics Institute, School of Plant Sciences and Department of Ecology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America.

Publication information

PLoS Genet. 2009 Nov;5(11):e1000728. doi: 10.1371/journal.pgen.1000728. Epub 2009 Nov 20.

Abstract

Most of our understanding of plant genome structure and evolution has come from the careful annotation of small (e.g., 100 kb) sequenced genomic regions or from automated annotation of complete genome sequences. Here, we sequenced and carefully annotated a contiguous 22 Mb region of maize chromosome 4 using an improved pseudomolecule for annotation. The sequence segment was comprehensively ordered, oriented, and confirmed using the maize optical map. Nearly 84% of the sequence is composed of transposable elements (TEs) that are mostly nested within each other, of which most families are low-copy. We identified 544 gene models using multiple levels of evidence, as well as five miRNA genes. Gene fragments, many captured by TEs, are prevalent within this region. Elimination of gene redundancy from a tetraploid maize ancestor that originated a few million years ago is responsible in this region for most disruptions of synteny with sorghum and rice. Consistent with other sub-genomic analyses in maize, small RNA mapping showed that many small RNAs match TEs and that most TEs match small RNAs. These results, performed on approximately 1% of the maize genome, demonstrate the feasibility of refining the B73 RefGen_v1 genome assembly by incorporating optical map, high-resolution genetic map, and comparative genomic data sets. Such improvements, along with those of gene and repeat annotation, will serve to promote future functional genomic and phylogenomic research in maize and other grasses.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/23d1/2773423/51ce4dbf5b18/pgen.1000728.g001.jpg
