Department of Botany, University of Florida, Gainesville, Florida 32611 USA
Am J Bot. 2004 Jun;91(6):997-1001. doi: 10.3732/ajb.91.6.997.
The sequence of the plastid genome of Amborella trichopoda, the putative sister to all other extant angiosperms, was recently reported (Molecular Biology and Evolution 20: 1499-1505). Goremykin et al. used sequence data for 61 plastid genes from Amborella and 12 other embryophytes in phylogenetic analyses and concluded that Amborella is not the sister to the remaining flowering plants; the monocots instead occupy this position. The authors attributed their results, which differ substantially from all recent phylogenetic analyses of angiosperms, to the increased character sampling (30 017 nucleotides in their aligned matrix) in their analysis relative to published studies that included fewer genes but more taxa. We hypothesized that the difference in topology is not due to limited character sampling in previous studies but to limited taxon sampling in the analysis by Goremykin et al. To test this, we conducted a series of phylogenetic analyses using a three-gene, 12 (or more)-taxon data set to evaluate the topological effects of (i) including three vs. 61 genes for (nearly) the same set of taxa, (ii) analyzing different codon positions, (iii) substituting representatives of other basal lineages for Amborella, (iv) replacing the grasses used to represent the monocots with other monocots, selected either for their phylogenetic position or randomly, and (v) adding other basal taxa (Nymphaea, Austrobaileya, magnoliids, and monocots) to the 12-taxon data set. Our results demonstrate that the "monocots basal" topology obtained by Goremykin et al. is not due to increased character sampling of the plastid genome; their topology was obtained using only two plastid genes or two plastid genes and one nuclear gene. This topology was also retained when either Nymphaea or Austrobaileya was substituted for Amborella, demonstrating that any of the three basal lineages will attach to Calycanthus for lack of any other close branch. Furthermore, the "monocots basal" topology is not robust to changes in sampling of monocots. Simply adding Oncidium, for example, places Amborella sister to the other angiosperms. Thus, limited taxon sampling, focusing on organisms with complete genome sequences, can lead to artifactual results.