Biodiversity Research Center, Academia Sinica, Taipei, Taiwan.
Genome Biol Evol. 2011;3:1284-95. doi: 10.1093/gbe/evr095. Epub 2011 Sep 19.
The relationships among the five extant gymnosperm groups--gnetophytes, Pinaceae, non-Pinaceae conifers (cupressophytes), Ginkgo, and cycads--remain equivocal. To clarify this issue, we sequenced the chloroplast genomes (cpDNAs) of two cupressophytes, Cephalotaxus wilsoniana and Taiwania cryptomerioides, and 53 common chloroplast protein-coding genes from three other cupressophytes, Agathis dammara, Nageia nagi, and Sciadopitys verticillata, and from a non-Cycadaceae cycad, Bowenia serrulata. Comparative analyses of 11 conifer cpDNAs revealed that Pinaceae and cupressophytes each lost a different copy of the inverted repeats (IRs), which contrasts with the view that the same IR has been lost in all conifers. Given this structural finding, the IR-loss character no longer conflicts with the "gnepines" hypothesis (gnetophytes sister to Pinaceae). Chloroplast phylogenomic analyses of amino acid sequences recovered incongruent topologies under different tree-building methods; however, we demonstrated that highly heterotachous genes (genes whose rates differ greatly among lineages) contributed to the long-branch attraction (LBA) artifact, resulting in incongruent phylogenomic estimates. Additionally, amino acid compositions appear more heterogeneous in highly heterotachous genes than in less heterotachous genes among the five gymnosperm groups. Removal of highly heterotachous genes alleviated the LBA artifact and yielded congruent and robust tree topologies in which gnetophytes and Pinaceae formed a clade sister to cupressophytes (the gnepines hypothesis) and Ginkgo clustered with cycads. Adding more cupressophyte taxa did not improve the accuracy of chloroplast phylogenomics for the five gymnosperm groups. In contrast, removing highly heterotachous genes from data sets is simple and can increase confidence in evaluating the phylogeny of gymnosperms.
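The gene-filtering step described above can be illustrated with a minimal Python sketch. The abstract does not say how heterotachy was quantified, so the metric here (the coefficient of variation of a gene's per-lineage rates) is an assumption chosen for illustration only. The function names, the gene labels, the `keep_fraction` parameter, and all rate values are hypothetical and do not come from the study.

```python
from statistics import mean, pstdev


def heterotachy_score(lineage_rates):
    """Coefficient of variation of one gene's per-lineage rates.

    Assumed, illustrative metric: a larger value means the gene's rate
    differs more strongly among lineages.
    """
    values = list(lineage_rates.values())
    avg = mean(values)
    if avg == 0:
        return 0.0
    return pstdev(values) / avg


def split_by_heterotachy(gene_rates, keep_fraction=0.5):
    """Rank genes by score; return (kept_low, removed_high) name lists."""
    ranked = sorted(gene_rates, key=lambda g: heterotachy_score(gene_rates[g]))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep], ranked[n_keep:]


# Made-up relative rates for four hypothetical genes in the five groups.
groups = ["gnetophytes", "Pinaceae", "cupressophytes", "Ginkgo", "cycads"]
example_rates = {
    "geneA": dict(zip(groups, [0.90, 0.10, 0.12, 0.05, 0.06])),
    "geneB": dict(zip(groups, [0.10, 0.11, 0.09, 0.10, 0.10])),
    "geneC": dict(zip(groups, [0.20, 0.15, 0.18, 0.12, 0.14])),
    "geneD": dict(zip(groups, [0.60, 0.08, 0.10, 0.07, 0.09])),
}

kept, removed = split_by_heterotachy(example_rates, keep_fraction=0.5)
print("kept for tree building:", kept)
print("removed as highly heterotachous:", removed)
```

In this toy input, the two genes with one strongly accelerated lineage are set aside and the two with near-uniform rates are kept. In a real analysis the retained genes would then be concatenated and passed to the tree-building methods, and the cutoff would need to be justified rather than fixed at one half.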