Brown University, Department of Ecology and Evolutionary Biology, Box G-W, 80 Waterman St, Providence, Rhode Island, 02912, USA.
Yale University, Department of Ecology and Evolutionary Biology, 165 Prospect Street, New Haven, Connecticut, 06511, USA.
Am J Bot. 2018 Mar;105(3):602-613. doi: 10.1002/ajb2.1051. Epub 2018 Apr 16.
Next-generation sequencing facilitates rapid production of well-sampled phylogenies built from very large genetic data sets, which can then be exploited to examine the molecular evolution of the genes themselves. We present an evolutionary analysis of 83 gene families (19 containing carbon-concentrating mechanism (CCM) genes, 64 containing non-CCM genes) in the portullugo clade (Caryophyllales), a diverse lineage of mostly arid-adapted plants that contains multiple evolutionary origins of all known photosynthesis types in land plants (C3, C4, CAM, C3-CAM, and various intermediates).
We inferred a phylogeny of 197 individuals from 167 taxa using coalescent-based approaches and individual gene family trees using maximum likelihood. Positive selection analyses were conducted on individual gene family trees with a mixed effects model of evolution (MEME). We devised new indices to compare levels of convergence and prevalence of particular residues between CCM and non-CCM genes and between species with different photosynthetic pathways.
Contrary to expectations, there were no significant differences in the levels of positive selection detected in CCM versus non-CCM genes. However, we documented a significantly higher level of convergent amino acid substitutions in CCM genes, especially in C4 taxa.
Our analyses reveal a new suite of amino acid residues putatively important for C4 and CAM function. We discuss both the advantages and challenges of using targeted enrichment sequence data for exploratory studies of molecular evolution.