School of Agriculture and Biology, Shanghai Jiao Tong University, 200240 Shanghai, China.
National Key Laboratory of Plant Molecular Genetics, Chinese Academy of Sciences Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology & Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032 Shanghai, China.
Proc Natl Acad Sci U S A. 2019 Sep 17;116(38):18893-18899. doi: 10.1073/pnas.1910401116. Epub 2019 Sep 4.
Aquatic plants must adapt to environments distinct from those in which land plants grow. A critical aspect of this adaptation is the dynamics of sequence repeats, which older sequencing platforms could not resolve because short reads yield incomplete and fragmented genome assemblies. We therefore sequenced the genome with PacBio long reads, achieving a 44-fold increase in contiguity, with a contig N50 (a length-weighted median of contig lengths: contigs of at least this length contain half of the assembled bases) of 831 kb, and filling 95.4% of the gaps left in the previous version. Reconstruction of repeat regions indicates that sequentially nested long terminal repeat (LTR) retrotranspositions occurred early in monocot evolution, and that the genome features both prokaryote-like gene-rich regions and eukaryotic repeat islands. The protein-coding gene set is reduced to 18,708 gene models, supported by 492,435 high-quality full-length PacBio complementary DNA (cDNA) sequences. In contrast to land plants, the primitive architecture of the plant's adventitious roots and its lack of lateral roots and root hairs are consistent with nutrient absorption by the roots being dispensable. Disease-resistance genes encoding antimicrobial peptides and dirigent proteins are expanded by tandem duplications. Remarkably, these disease-resistance genes are not only amplified but also highly expressed, consistent with low levels of the 24-nucleotide (nt) small interfering RNAs (siRNAs) that silence the immune system in land plants, thereby protecting the plant against a wide spectrum of pathogens and pests. The long-read sequence information not only sheds light on plant evolution and adaptation to the environment, but also facilitates applications in bioenergy and phytoremediation.
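Because the contiguity claim rests on N50, a minimal sketch of how that statistic is conventionally computed may help. It follows the standard definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembled bases). The function name `n50` and the toy contig lengths are hypothetical illustrations, not data or code from the study.

```python
import statistics


def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in kb); NOT data from the paper.
toy_contigs = [100, 60, 50, 40, 30, 20]

print("N50:", n50(toy_contigs))                   # length-weighted statistic
print("median:", statistics.median(toy_contigs))  # plain median, for contrast
```

In this toy case the N50 (60 kb) exceeds the plain median (45 kb), because long contigs carry more weight: they account for more of the assembled bases.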