A Simulation Study to Examine the Information Content in Phylogenomic Data Sets under the Multispecies Coalescent Model.

Author affiliations

Department of Genetics, Evolution and Environment, University College London, London, United Kingdom.

Department of Mathematics, Beijing Jiaotong University, Beijing, P.R. China.

Publication information

Mol Biol Evol. 2020 Nov 1;37(11):3211-3224. doi: 10.1093/molbev/msaa166.

Abstract

We use computer simulation to examine the information content in multilocus data sets for inference under the multispecies coalescent model. Inference problems considered include estimation of evolutionary parameters (such as species divergence times, population sizes, and cross-species introgression probabilities), species tree estimation, and species delimitation based on Bayesian comparison of delimitation models. We found that the number of loci is the most influential factor for almost all inference problems examined. Although the number of sequences per species does not appear to be important to species tree estimation, it is very influential to species delimitation. Increasing the number of sites and the per-site mutation rate both increase the mutation rate for the whole locus and these have the same effect on estimation of parameters, but the sequence length has a greater effect than the per-site mutation rate for species tree estimation. We discuss the computational costs when the data size increases and provide guidelines concerning the subsampling of genomic data to enable the application of full-likelihood methods of inference.
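The statement that sequence length and per-site mutation rate affect parameter estimation equally can be illustrated with a rough sketch. It assumes an infinite-sites approximation, under which the expected number of differences between two sequences sampled from one population is the number of sites times the per-site theta (4Nμ). The function name and the numeric values are hypothetical and are not taken from the paper.

```python
import math


def expected_pairwise_differences(n_sites, theta_per_site):
    """Expected differences between two sequences from one population.

    Assumes an infinite-sites approximation: the expectation is the
    product of the locus length and the per-site theta (4 * N * mu).
    """
    return n_sites * theta_per_site


# Hypothetical baseline locus: 500 sites, theta = 0.001 per site.
baseline = expected_pairwise_differences(500, 0.001)

# Doubling the number of sites ...
longer_locus = expected_pairwise_differences(1000, 0.001)

# ... or doubling the per-site mutation rate (and hence theta).
faster_rate = expected_pairwise_differences(500, 0.002)

# Both changes double the per-locus expectation by the same amount.
same_effect = math.isclose(longer_locus, faster_rate)
print(baseline, longer_locus, faster_rate, same_effect)
```

This only captures the per-locus mutation rate. It does not model species tree estimation, where the abstract reports that sequence length has a greater effect than the per-site mutation rate.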
