Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany.
Plant Genome and Systems Biology, Helmholtz Center Munich, 85764, Neuherberg, Germany.
Plant J. 2018 Feb;93(3):515-533. doi: 10.1111/tpj.13801.
The draft genome of the moss model, Physcomitrella patens, comprised approximately 2000 unordered scaffolds. To enable analyses of genome structure and evolution, we generated a chromosome-scale genome assembly using genetic linkage as well as (end) sequencing of long DNA fragments. We find that 57% of the genome comprises transposable elements (TEs), some of which may be actively transposing during the life cycle. Unlike in flowering plant genomes, gene- and TE-rich regions show an overall even distribution along the chromosomes. However, the chromosomes are mono-centric with peaks of a class of Copia elements potentially coinciding with centromeres. Gene body methylation is evident in 5.7% of the protein-coding genes, typically coinciding with low GC and low expression. Some giant virus insertions are transcriptionally active and might protect gametes from viral infection via siRNA-mediated silencing. Structure-based detection methods show that the genome evolved via two rounds of whole-genome duplications (WGDs), apparently common in mosses but not in liverworts and hornworts. Several hundred genes are present in colinear regions conserved since the last common ancestor of plants. These syntenic regions are enriched for functions related to plant-specific cell growth and tissue organization. The P. patens genome lacks the TE-rich pericentromeric and gene-rich distal regions typical of most flowering plant genomes. More non-seed plant genomes are needed to unravel how plant genomes evolve, and to understand whether the P. patens genome structure is typical of mosses or bryophytes.