School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China.
National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology & Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.
BMC Genomics. 2020 Jan 28;21(1):76. doi: 10.1186/s12864-020-6499-y.
Duckweeds (Lemnaceae) are aquatic plants distributed all over the world. The chloroplast genome, as an efficient solar-powered reactor, is an invaluable resource for studying biodiversity and for carrying foreign genes. Chloroplast genome sequencing has become routine and less expensive with the advent of high-throughput sequencing technologies, allowing in-depth investigation of the genomics and transcriptomics of duckweed organelles.
Here, the complete chloroplast genome of Spirodela polyrhiza 7498 (SpV2) is assembled from PacBio sequencing data. The 168,956 bp circular genome is composed of a pair of inverted repeats of 31,844 bp each, a large single-copy region of 91,210 bp and a small single-copy region of 14,058 bp. Compared to the previous version (SpV1), which was assembled from short reads, the integrity and quality of SpV2 are improved, especially through the recovery of two repeated fragments in the ycf2 gene. The genome contains 107 unique genes, including 78 protein-coding genes, 25 tRNA genes and 4 rRNA genes. Based on full-length cDNAs generated by PacBio isoform sequencing, seven genes (ycf3, clpP, atpF, rpoC1, rpl2, rps12 and ndhA) are found to contain group II introns. The ndhA intron shows 50% more sequence divergence than the species-barcoding marker atpF-atpH, indicating its potential to discriminate closely related species. A total of 37 RNA editing sites with cytosine (C) to uracil (U) substitutions are recognized, eight of which are newly defined: six in intergenic regions and two in the coding sequences of the rpoC2 and ndhA genes. In addition, nine operon classes are identified using transcriptomic data. The operons contain multiple subunit genes encoding the same functional complexes, such as ATP synthase, the photosystems and ribosomal proteins, which could be simultaneously transcribed and coordinately translated in response to cellular stimuli.
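The figures reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original study: the variable and function names are hypothetical, and the only inputs are the numbers stated in the abstract. It assumes the usual quadripartite layout in which the total length is the sum of two inverted-repeat copies, the large single-copy region and the small single-copy region.

```python
# Consistency check of the numbers reported in the abstract for SpV2.
# All names below are illustrative; the values are taken from the text.

INVERTED_REPEAT_BP = 31_844      # length of ONE inverted-repeat copy
LARGE_SINGLE_COPY_BP = 91_210
SMALL_SINGLE_COPY_BP = 14_058
REPORTED_TOTAL_BP = 168_956


def quadripartite_length(ir_bp: int, lsc_bp: int, ssc_bp: int) -> int:
    """Total length of a circular genome with two IR copies, one LSC and one SSC."""
    return 2 * ir_bp + lsc_bp + ssc_bp


computed_total_bp = quadripartite_length(
    INVERTED_REPEAT_BP, LARGE_SINGLE_COPY_BP, SMALL_SINGLE_COPY_BP
)

# Unique gene counts by category, as reported.
gene_counts = {"protein_coding": 78, "tRNA": 25, "rRNA": 4}
unique_genes = sum(gene_counts.values())

# Newly defined RNA editing sites: intergenic plus coding.
new_editing_sites = {"intergenic": 6, "coding": 2}
new_sites_total = sum(new_editing_sites.values())

# Share of the genome occupied by the two inverted-repeat copies.
ir_fraction = 2 * INVERTED_REPEAT_BP / computed_total_bp

print(f"computed genome length: {computed_total_bp} bp")
print(f"matches reported total: {computed_total_bp == REPORTED_TOTAL_BP}")
print(f"unique genes: {unique_genes}")
print(f"newly defined editing sites: {new_sites_total}")
print(f"inverted-repeat share of genome: {ir_fraction:.1%}")
```

The region lengths, gene categories and newly defined editing sites each sum to the totals given in the text (168,956 bp, 107 genes and 8 sites).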
Understanding the chloroplast genomics and transcriptomics of S. polyrhiza would greatly facilitate the study of phylogenetic evolution and the application of genetic engineering in duckweeds.