Chen Zhiwen, Grover Corrinne E, Li Pengbo, Wang Yumei, Nie Hushuai, Zhao Yanpeng, Wang Meiyan, Liu Fang, Zhou Zhongli, Wang Xingxing, Cai Xiaoyan, Wang Kunbo, Wendel Jonathan F, Hua Jinping
Laboratory of Cotton Genetics, Genomics and Breeding, College of Agronomy and Biotechnology/Key Laboratory of Crop Heterosis and Utilization of Ministry of Education/Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China.
Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA 50011, USA.
Mol Phylogenet Evol. 2017 Jul;112:268-276. doi: 10.1016/j.ympev.2017.04.014. Epub 2017 Apr 13.
Cotton (Gossypium spp.) is commonly grouped into eight diploid genomic groups, designated A-G and K, and one tetraploid genomic group, namely AD. To gain insight into the phylogeny of Gossypium and the molecular evolution of the chloroplast genome during diversification, chloroplast genomes (cpDNA) from 6 D-genome and 2 G-genome species of Gossypium (G. armourianum D, G. harknessii D, G. davidsonii D, G. klotzschianum D, G. aridum D, G. trilobum D, and G. australe G, G. nelsonii G) are newly reported here. In combination with the 26 previously released cpDNA sequences, we performed comparative phylogenetic analyses of 34 Gossypium chloroplast genomes that collectively represent most of the diversity in the genus. Gossypium chloroplasts span a small range in size that is mostly attributable to indels that occur in the large single copy (LSC) region of the genome. Phylogenetic analysis using a concatenation of all genes provides robust support for six major Gossypium clades, largely supporting earlier inferences but also revealing new information on intrageneric relationships. Using Theobroma cacao as an outgroup, diversification of the genus was dated, yielding results that are in accord with previous estimates of divergence times, but also offering new perspectives on the basal, early radiation of all major clades within the genus as well as gaps in the record indicative of extinctions. Like most higher-plant chloroplast genomes, all cotton species exhibit a conserved quadripartite structure, i.e., two large inverted repeats (IR) containing most of the ribosomal RNA genes, separated by two unique regions, the large single copy (LSC) and small single copy (SSC) regions. Within Gossypium, the IR-single copy region junctions are both variable and homoplasious among species. Two genes, accD and psaJ, exhibited greater rates of synonymous and non-synonymous substitutions than did other genes.
Most genes exhibited Ka/Ks ratios suggestive of neutral evolution, with 8 exceptions distributed among one to several species. This research provides an overview of the molecular evolution of a single, large non-recombining molecule during the diversification of this important genus.