Guangdong Key Lab of Ornamental Plant Germplasm Innovation and Utilization, Environmental Horticulture Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640, China.
BMC Genomics. 2024 Jan 17;25(1):68. doi: 10.1186/s12864-024-09996-4.
Costaceae, commonly known as the spiral ginger family, consists of approximately 120 species distributed in the tropical regions of South America, Africa, and Southeast Asia, some of which have important ornamental, medicinal and ecological value. Previous studies of the phylogeny and taxonomy of Costaceae based on nuclear internal transcribed spacer (ITS) and chloroplast genome fragment data had low resolution. Additionally, the structure, variation and molecular evolution of complete chloroplast genomes in Costaceae remain unclear. Herein, a total of 13 complete chloroplast genomes of Costaceae, including 8 newly sequenced and 5 from the NCBI GenBank database and representing all three distribution regions of this family, were comprehensively analyzed for comparative genomics and phylogenetic relationships.
The 13 complete chloroplast genomes of Costaceae possessed typical quadripartite structures with lengths from 166,360 to 168,966 bp, comprising a large single copy (LSC, 90,802 - 92,189 bp), a small single copy (SSC, 18,363 - 20,124 bp) and a pair of inverted repeats (IRs, 27,982 - 29,203 bp). These genomes encoded 111 - 113 different genes, including 79 protein-coding genes, 4 rRNA genes and 28 - 30 tRNA genes. The gene order, gene content, amino acid frequencies and codon usage within Costaceae were highly conserved, but several variations in intron loss, long repeats, simple sequence repeats (SSRs) and gene expansion at the IR/SC boundaries were also found among these 13 genomes. Comparative genomics within Costaceae identified five highly divergent regions: ndhF, ycf1-D2, ccsA-ndhD, rps15-ycf1-D2 and rpl16-exon2-rpl16-exon1. Five combined DNA regions (ycf1-D2 + ndhF, ccsA-ndhD + rps15-ycf1-D2, rps15-ycf1-D2 + rpl16-exon2-rpl16-exon1, ccsA-ndhD + rpl16-exon2-rpl16-exon1, and ccsA-ndhD + rps15-ycf1-D2 + rpl16-exon2-rpl16-exon1) could be used as potential markers for future phylogenetic analyses and species identification in Costaceae. Positive selection was detected in eight protein-coding genes: cemA, clpP, ndhA, ndhF, petB, psbD, rps12 and ycf1. Maximum likelihood and Bayesian phylogenetic trees based on chloroplast genome sequences consistently yielded identical topologies with high support among species of Costaceae. Costaceae was divided into three clades: the Asian clade, the Costus clade and the South American clade. Within the Asian clade, Tapeinochilos was sister to Hellenia, and Parahellenia was sister to the cluster of Tapeinochilos + Hellenia, with strong support. Molecular dating showed that the crown age of Costaceae was about 30.5 Mya (95% HPD: 14.9 - 49.3 Mya), and that the Costus clade and the Asian clade started to diverge around 23.8 Mya (95% HPD: 10.1 - 41.5 Mya). The Asian clade diverged into Hellenia and Parahellenia at approximately 10.7 Mya (95% HPD: 3.5 - 25.1 Mya).
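As a small consistency check on the figures reported above, the sketch below encodes the quadripartite layout (one LSC, one SSC and two IR copies) and the gene-category counts. It is an illustration only, not part of the authors' pipeline: the names `Plastome`, `total_bounds` and `gene_count_range` are hypothetical, and the only inputs are the ranges quoted in the text. Because the reported minima and maxima come from different genomes, the summed bounds are expected to enclose the reported total-length range rather than equal it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plastome:
    """Region lengths (bp) of a quadripartite chloroplast genome (hypothetical helper)."""
    lsc: int  # large single copy
    ssc: int  # small single copy
    ir: int   # length of ONE inverted repeat copy

    @property
    def total(self) -> int:
        # A quadripartite genome is LSC + SSC + two IR copies.
        return self.lsc + self.ssc + 2 * self.ir


# Ranges quoted in the abstract (min, max), in bp.
LSC_RANGE = (90_802, 92_189)
SSC_RANGE = (18_363, 20_124)
IR_RANGE = (27_982, 29_203)
TOTAL_RANGE = (166_360, 168_966)


def total_bounds() -> tuple[int, int]:
    """Loosest possible total length implied by the per-region ranges."""
    lo = Plastome(LSC_RANGE[0], SSC_RANGE[0], IR_RANGE[0]).total
    hi = Plastome(LSC_RANGE[1], SSC_RANGE[1], IR_RANGE[1]).total
    return lo, hi


def gene_count_range(protein_coding: int = 79, rrna: int = 4,
                     trna_range: tuple[int, int] = (28, 30)) -> tuple[int, int]:
    """Distinct gene count implied by the per-category counts."""
    fixed = protein_coding + rrna
    return fixed + trna_range[0], fixed + trna_range[1]


if __name__ == "__main__":
    lo, hi = total_bounds()
    print(f"implied total length bounds: {lo} to {hi} bp")
    print(f"reported totals enclosed: {lo <= TOTAL_RANGE[0] and TOTAL_RANGE[1] <= hi}")
    print(f"implied gene count range: {gene_count_range()}")
```

The gene-category sum reproduces the reported 111 - 113 genes, and the reported genome lengths fall inside the bounds implied by the region ranges.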
Complete chloroplast genomes can resolve the phylogenetic relationships of Costaceae and provide new insights into genome structure, variation and evolution. The divergent DNA regions identified here should be useful for species identification and phylogenetic inference in Costaceae.