Molecular and Computational Biology, University of Southern California, Los Angeles, California, USA.
Nat Genet. 2011 May;43(5):476-81. doi: 10.1038/ng.807. Epub 2011 Apr 10.
We report the 207-Mb genome sequence of the North American Arabidopsis lyrata strain MN47 based on 8.3× dideoxy sequence coverage. We predict 32,670 genes in this outcrossing species compared to the 27,025 genes in the selfing species Arabidopsis thaliana. The much smaller 125-Mb genome of A. thaliana, which diverged from A. lyrata 10 million years ago, likely constitutes the derived state for the family. We found evidence for DNA loss from large-scale rearrangements, but most of the difference in genome size can be attributed to hundreds of thousands of small deletions, mostly in noncoding DNA and transposons. Analysis of deletions and insertions still segregating in A. thaliana indicates that the process of DNA loss is ongoing, suggesting pervasive selection for a smaller genome. The high-quality reference genome sequence for A. lyrata will be an important resource for functional, evolutionary and ecological studies in the genus Arabidopsis.