Phylotranscriptomics: saturated third codon positions radically influence the estimation of trees based on next-gen data.

Authors

Breinholt Jesse W, Kawahara Akito Y

Affiliation

Florida Museum of Natural History, University of Florida.

Publication

Genome Biol Evol. 2013;5(11):2082-92. doi: 10.1093/gbe/evt157.

Abstract

Recent advancements in molecular sequencing techniques have led to a surge in the number of phylogenetic studies that incorporate large amounts of genetic data. We test the assumption that analyzing a large number of genes will lead to improvements in tree resolution and branch support using moths in the superfamily Bombycoidea, a group with some interfamilial relationships that have been difficult to resolve. Specifically, we use a next-gen data set that included 19 taxa and 938 genes (∼1.2M bp) to examine how codon position and saturation might influence resolution and node support among three key families. Maximum likelihood, parsimony, and species tree analysis using gene tree parsimony, on different nucleotide and amino acid data sets, resulted in largely congruent topologies with high bootstrap support compared with prior studies that included fewer loci. However, for a few shallow nodes, nucleotide and amino acid data provided high support for conflicting relationships. The third codon position was saturated, and phylogenetic analysis of this position alone supported a completely different, potentially misleading sister group relationship. We used the program RADICAL to assess the number of genes needed to fix some of these difficult nodes. One such node originally needed a total of 850 genes but only required 250 when synonymous signal was removed. Our study shows that, in order to effectively use next-gen data to correctly resolve difficult phylogenetic relationships, it is necessary to assess the effects of synonymous substitutions and third codon positions.
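To make the idea of assessing codon positions separately more concrete, the following is a minimal Python sketch. It is not the authors' pipeline and does not use RADICAL. It assumes an in-frame, aligned coding sequence per taxon and uses the mean uncorrected pairwise distance (p-distance) per codon position as a crude first look at how much faster third positions vary. A formal saturation test would be needed to draw conclusions. All function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def split_codon_positions(seq):
    """Return the 1st, 2nd and 3rd codon positions of an in-frame sequence."""
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon, if any
    seq = seq[:usable]
    return seq[0::3], seq[1::3], seq[2::3]


def p_distance(a, b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    valid = set("ACGT")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in valid and y in valid]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_p_distance_by_position(alignment):
    """Mean pairwise p-distance for each of the three codon positions.

    `alignment` maps taxon name -> aligned, in-frame coding sequence
    (an assumption of this sketch; real data need frame checking first).
    """
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    split = {name: split_codon_positions(seq)
             for name, seq in alignment.items()}
    means = []
    for pos in range(3):
        dists = [p_distance(split[a][pos], split[b][pos])
                 for a, b in combinations(split, 2)]
        means.append(sum(dists) / len(dists))
    return means


# Hypothetical toy alignment: only third codon positions differ.
toy = {
    "taxon_A": "ATGGCTAAAGGT",
    "taxon_B": "ATGGCCAAGGGA",
    "taxon_C": "ATGGCAAAAGGC",
}
first, second, third = mean_p_distance_by_position(toy)
print(f"pos1={first:.3f} pos2={second:.3f} pos3={third:.3f}")
```

In this toy case all variation sits at third positions, which is the pattern expected when changes are mostly synonymous. The first two slices returned by `split_codon_positions` can be interleaved again to build a data set with third positions removed, which is one simple way to compare trees with and without that signal.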

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b9c/3845638/b3d5e3dd55e7/evt157f1p.jpg
