Jiangsu Engineering Research Center for Taxodium Rich. Germplasm Innovation and Propagation, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences (Nanjing Botanical Garden Mem. Sun Yat-Sen), Nanjing, China.
Biodata Biotechnology Co. Ltd, Hefei, China.
BMC Genomics. 2020 Jan 31;21(1):114. doi: 10.1186/s12864-020-6532-1.
Chloroplast (cp) genome information would facilitate the development and utilization of Taxodium resources. However, the cp genome characteristics of Taxodium have been poorly understood.
We determined the complete cp genome sequences of T. distichum, T. mucronatum, and T. ascendens. The cp genomes are 131,947 bp to 132,613 bp in length, encode 120 genes in the same order, and lack typical inverted repeat (IR) regions. The longest small IR, a 282 bp trnQ-containing IR, was involved in the formation of isomers. Comparative analysis of the 3 cp genomes showed that 91.57% of the indels resulted in the periodic variation of tandem repeat (TR) motifs and 72.46% of the single nucleotide polymorphisms (SNPs) were located close to TRs, suggesting a relationship between TRs and mutational dynamics. Eleven hypervariable regions were identified as candidates for DNA barcode development. Hypothetical cp open reading frame 1 (ycf1) was the only gene with an indel in its coding DNA sequence, and that indel is composed of a long TR. When the comparison was extended to cupressophytes, ycf1 genes were found to have undergone a universal insertion of TRs accompanied by extreme length expansion. Meanwhile, ycf1 was also located at rearrangement endpoints of cupressophyte cp genomes. All these characteristics highlight the important role of repeats in the evolution of cp genomes.
This study adds new evidence for the role of repeats in the dynamics of cp genome mutation and rearrangement. Moreover, the information on TRs and hypervariable regions provides reliable molecular resources for future research on infrageneric taxon identification, phylogenetic resolution, population structure, and biodiversity in the genus Taxodium and in cupressophytes.
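The reported share of SNPs located close to TRs can be illustrated with a small sketch. This is not the authors' pipeline: the regex-based detection of perfect tandem repeats, the 10 bp proximity window, and the function names find_tandem_repeats and fraction_snps_near_repeats are all assumptions made for illustration, and the toy sequence is invented.

```python
import re
from typing import List, Sequence, Tuple

Repeat = Tuple[int, int, str]  # (start, end exclusive, repeat unit)


def find_tandem_repeats(seq: str, min_unit: int = 2, max_unit: int = 6,
                        min_copies: int = 3) -> List[Repeat]:
    """Find perfect tandem repeats with a simple regex scan.

    Assumption: only exact copies are detected. Overlapping calls from
    different unit lengths are not merged.
    """
    repeats: List[Repeat] = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            repeats.append((match.start(), match.end(), match.group(1)))
    return repeats


def fraction_snps_near_repeats(snp_positions: Sequence[int],
                               repeats: Sequence[Repeat],
                               window: int = 10) -> float:
    """Fraction of 0-based SNP positions within `window` bp of any repeat.

    Assumption: the window size is arbitrary, not a value from the study.
    """
    if not snp_positions:
        return 0.0
    near = sum(
        1 for pos in snp_positions
        if any(start - window <= pos < end + window for start, end, _ in repeats)
    )
    return near / len(snp_positions)


# Toy example: one (AT)5 repeat embedded in non-repetitive sequence.
toy_seq = "ACGTTGCA" + "AT" * 5 + "GCTAGCTTAGGCATCG"
toy_repeats = find_tandem_repeats(toy_seq)
toy_fraction = fraction_snps_near_repeats([5, 20, 30], toy_repeats, window=3)
```

On real alignments, a dedicated repeat finder and SNP calls from a multiple sequence alignment would replace the regex scan and the hand-written position list used here.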