Breinholt Jesse W, Kawahara Akito Y
Florida Museum of Natural History, University of Florida.
Genome Biol Evol. 2013;5(11):2082-92. doi: 10.1093/gbe/evt157.
Recent advances in molecular sequencing techniques have led to a surge in the number of phylogenetic studies that incorporate large amounts of genetic data. We test the assumption that analyzing a large number of genes will improve tree resolution and branch support, using moths in the superfamily Bombycoidea, a group in which some interfamilial relationships have been difficult to resolve. Specifically, we use a next-generation sequencing data set of 19 taxa and 938 genes (approximately 1.2 million bp) to examine how codon position and saturation might influence resolution and node support among three key families. Maximum likelihood, parsimony, and species tree analysis using gene tree parsimony, applied to different nucleotide and amino acid data sets, resulted in largely congruent topologies with high bootstrap support relative to prior studies that included fewer loci. However, for a few shallow nodes, nucleotide and amino acid data provided high support for conflicting relationships. The third codon position was saturated, and phylogenetic analysis of this position alone supported a completely different, potentially misleading sister group relationship. We used the program RADICAL to assess the number of genes needed to fix some of these difficult nodes. One such node originally needed a total of 850 genes but required only 250 when synonymous signal was removed. Our study shows that, to use next-generation sequencing data effectively to resolve difficult phylogenetic relationships correctly, it is necessary to assess the effects of synonymous substitutions and third codon positions.
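To illustrate the kind of codon-position comparison described above, the following is a minimal sketch, not the authors' pipeline. It assumes an in-frame coding alignment, splits each sequence into first plus second positions and third positions, and compares mean uncorrected pairwise distances (p-distances) between the two partitions. The function names and the toy alignment are hypothetical, and the sketch neither tests for saturation formally nor removes synonymous signal, which would need a separate recoding step.

```python
from itertools import combinations


def split_codon_positions(seq):
    """Return (positions 1+2, position 3) of an in-frame coding sequence."""
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon, if any
    pos12 = "".join(seq[i:i + 2] for i in range(0, usable, 3))
    pos3 = seq[2:usable:3]
    return pos12, pos3


def p_distance(a, b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            diffs += x != y
    return diffs / compared if compared else float("nan")


def mean_pairwise_distance(seqs):
    """Mean p-distance over all unordered pairs of sequences."""
    dists = [p_distance(a, b) for a, b in combinations(seqs, 2)]
    return sum(dists) / len(dists)


# Hypothetical toy alignment: three taxa, three codons each.
alignment = {
    "taxonA": "ATGGCTAAA",
    "taxonB": "ATGGCCAAG",
    "taxonC": "ATGGCGAAA",
}

partitions = [split_codon_positions(s) for s in alignment.values()]
pos12_seqs = [p[0] for p in partitions]
pos3_seqs = [p[1] for p in partitions]

mean_pos12 = mean_pairwise_distance(pos12_seqs)
mean_pos3 = mean_pairwise_distance(pos3_seqs)

print(f"mean p-distance, positions 1+2: {mean_pos12:.3f}")
print(f"mean p-distance, position 3:   {mean_pos3:.3f}")
```

In this toy alignment all differences fall on third positions, so the third-position partition shows a much larger mean distance than the first and second positions. On real data, a large gap of this kind is only a prompt to examine third positions more closely, for example by analyzing the partitions separately as the abstract describes.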