Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data.

Affiliations

Department of Computer Science.

Department of BioSciences, Rice University, 6100 Main Street, Houston, TX 77005, USA.

Publication information

Syst Biol. 2018 May 1;67(3):439-457. doi: 10.1093/sysbio/syx085.

Abstract

The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multiple loci from within and across species evolve in the presence of both incomplete lineage sorting (ILS) and reticulation (e.g., hybridization). We report on a Bayesian method for sampling the parameters of this generative model, including the species phylogeny, gene trees, divergence times, and population sizes, from DNA sequences of multiple independent loci. We demonstrate the utility of our method by analyzing simulated data and reanalyzing an empirical data set. Our results demonstrate the significance of not only coestimating species phylogenies and gene trees, but also accounting for reticulation and ILS simultaneously. In particular, we show that when gene flow occurs, our method accurately estimates the evolutionary histories, coalescence times, and divergence times. Tree inference methods, on the other hand, underestimate divergence times and overestimate coalescence times when the evolutionary history is reticulate. While the MSNC corresponds to an abstract model of "intermixture," we study the performance of the model and method on simulated data generated under a gene flow model. We show that the method accurately infers the most recent time at which gene flow occurs. Finally, we demonstrate the application of the new method to a 106-locus yeast data set.
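To make the generative view concrete, the following is a minimal sketch of the two backward-in-time steps that a network coalescent combines: lineages coalescing along a single branch, and lineages choosing a parent at a reticulation node. It is an illustration under stated assumptions, not the authors' implementation: branch lengths are taken to be in coalescent units, each lineage at a reticulation node is assumed to pick a parent independently with a hypothetical inheritance probability `gamma`, and the function names are invented for this example.

```python
import random


def lineages_after_branch(k, length, rng):
    """Trace k lineages backward along one branch and return how many remain.

    Assumes `length` is in coalescent units, so that with k lineages the
    waiting time to the next coalescence is exponential with rate k(k-1)/2.
    """
    elapsed = 0.0
    while k > 1:
        rate = k * (k - 1) / 2.0
        elapsed += rng.expovariate(rate)
        if elapsed > length:
            break  # the next coalescence would fall above the branch
        k -= 1  # two lineages merge into one
    return k


def split_at_reticulation(k, gamma, rng):
    """Assign k lineages to the two parents of a reticulation node.

    Assumption: each lineage independently follows the first parent with
    probability `gamma` and the second parent otherwise.
    """
    to_first = sum(1 for _ in range(k) if rng.random() < gamma)
    return to_first, k - to_first


if __name__ == "__main__":
    rng = random.Random(1)
    remaining = lineages_after_branch(5, 0.8, rng)
    print("lineages leaving the branch:", remaining)
    print("split across parents:", split_at_reticulation(remaining, 0.3, rng))
```

Short branches leave more lineages uncoalesced, which is the source of ILS, while the split at the reticulation node lets different loci trace back through different parents. A full sampler would also place mutations along the resulting gene trees and infer the network, times, and population sizes, none of which this sketch attempts.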
