Kim Young-Kee, Jo Sangjin, Cheon Se-Hwan, Joo Min-Jung, Hong Ja-Ram, Kwak Myounghai, Kim Ki-Joong
Division of Life Sciences, Korea University, Seoul, South Korea.
Department of Plant Resources, National Institute of Biological Resources, Incheon, South Korea.
Front Plant Sci. 2020 Feb 21;11:22. doi: 10.3389/fpls.2020.00022. eCollection 2020.
To understand the evolution of the orchid plastome, we annotated 124 complete plastomes of Orchidaceae, representing all the major lineages, and compared their structures, gene contents, gene rearrangements, and IR contractions/expansions. Forty-two of these plastomes were generated in the corresponding author's laboratory, and 24 plastomes, including nine genera ([…]), are new in this study. All orchid plastomes, except […] and […], have a quadripartite structure consisting of a large single copy (LSC) region, two inverted repeats (IRs), and a small single copy (SSC) region. The IR region was completely lost in the […] plastomes. The SSC is lost in the […] plastome. The smallest plastome was 19,047 bp, in […], and the largest was 178,131 bp, in […]. The small plastome sizes are primarily the result of gene losses associated with mycoheterotrophic habitats, while the large plastome sizes are due to the expansion of noncoding regions. The minimal number of genes common to all orchid plastomes and needed to maintain minimal plastome activity was 15: three subunits of […] (14, 16, and 36), seven subunits of […] (2, 3, 4, 7, 8, 11, and 14), three subunits of […] (5, 16, and 23), […]C-GCA, and […]P. Three stages of gene loss were observed among the orchid plastomes. The first was […] gene loss, which is widespread in Apostasioideae, Vanilloideae, Cypripedioideae, and Epidendroideae, but rare in Orchidoideae. The second stage was the loss of photosynthetic genes ([…]) and […] gene subunits, which is restricted to […] and some species of […] and […]. The third stage was the loss of genes related to prokaryotic gene expression ([…] and others), which was observed in […]. In addition, an intermediate stage between the second and third stages was observed in […] (Vanilloideae). The majority of intron losses are associated with the loss of their corresponding genes. In some orchid taxa, however, introns have been lost in […]16, […]16, and […]P(2) without their corresponding genes being lost.
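The gene tally and the size range quoted above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract. The labels `group_a`, `group_b`, and `group_c` are hypothetical stand-ins, because the gene-family names are missing from this copy of the text.

```python
# Subunit numbers as listed in the abstract. The group labels are
# hypothetical placeholders: the gene-family names are missing in this copy.
minimal_gene_set = {
    "group_a": [14, 16, 36],
    "group_b": [2, 3, 4, 7, 8, 11, 14],
    "group_c": [5, 16, 23],
    "C-GCA": [None],  # single gene, name prefix missing in this copy
    "P": [None],      # single gene, name prefix missing in this copy
}

# Plastome size extremes reported in the abstract, in base pairs.
SMALLEST_BP = 19_047
LARGEST_BP = 178_131


def count_genes(gene_set):
    """Count one gene per listed subunit or single-gene entry."""
    return sum(len(members) for members in gene_set.values())


def size_ratio(largest, smallest):
    """How many times larger the largest plastome is than the smallest."""
    return largest / smallest


total = count_genes(minimal_gene_set)
ratio = size_ratio(LARGEST_BP, SMALLEST_BP)
print(f"minimal gene set: {total} genes")
print(f"largest / smallest plastome: {ratio:.2f}x")
```

The per-group counts (3 + 7 + 3 + 1 + 1) add up to the 15 genes stated in the text, and the largest plastome is roughly nine times the size of the smallest.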
A total of 104 gene rearrangements were counted when comparing 116 orchid plastomes, many of them concentrated near the IRa/b-SSC junction area. The plastome phylogeny of 124 orchid species confirmed the relationship {Apostasioideae [Vanilloideae (Cypripedioideae (Orchidoideae, Epidendroideae))]} at the subfamily level, and the phylogenetic relationships of 17 tribes were also established. Molecular clock analysis based on whole plastome sequences suggested that Orchidaceae diverged from its sister family 99.2 mya, with estimated divergence times for the five subfamilies as follows: Apostasioideae (79.91 mya), Vanilloideae (69.84 mya), Cypripedioideae (64.97 mya), Orchidoideae (59.16 mya), and Epidendroideae (59.16 mya). We also released the first nuclear ribosomal (nr) DNA unit (18S-ITS1-5.8S-ITS2-28S-NTS-ETS) sequences for 42 species of Orchidaceae. Finally, the phylogenetic tree based on the nrDNA unit sequences is compared with the tree based on the plastome sequences of the same 42 species, and the differences between the two datasets are discussed in this paper.
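The subfamily topology and the divergence estimates can be written down in a machine-readable form. The following is a minimal sketch, assuming a name-only Newick encoding of the bracketed relationship given above. The Newick string, the small parser, and all function names are illustrative choices and not the authors' tree file or analysis pipeline. The dates are copied from the abstract.

```python
# Topology as stated in the abstract, encoded as a name-only Newick string.
# This encoding is an illustration, not the authors' tree file.
NEWICK = (
    "(Apostasioideae,(Vanilloideae,(Cypripedioideae,"
    "(Orchidoideae,Epidendroideae))));"
)

# Estimated divergence times in million years ago (mya), from the abstract.
FAMILY_SPLIT_MYA = 99.2
DIVERGENCE_MYA = {
    "Apostasioideae": 79.91,
    "Vanilloideae": 69.84,
    "Cypripedioideae": 64.97,
    "Orchidoideae": 59.16,
    "Epidendroideae": 59.16,
}


def parse_newick(text):
    """Parse a name-only Newick string into nested tuples of leaf names."""
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
            return tuple(children)
        start = pos
        while text[pos] not in ",();":
            pos += 1
        return text[start:pos]

    return parse_node()


def leaves(tree):
    """Return leaf names in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def successive_gaps(dates):
    """Gaps (in million years) between consecutive distinct dates, oldest first."""
    ordered = sorted(set(dates), reverse=True)
    return [round(older - younger, 2) for older, younger in zip(ordered, ordered[1:])]


tree = parse_newick(NEWICK)
gaps = successive_gaps([FAMILY_SPLIT_MYA, *DIVERGENCE_MYA.values()])
print(tree)
print(gaps)
```

In this encoding, Orchidoideae and Epidendroideae form the innermost pair, which fits their shared estimate of 59.16 mya.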