An empirical examination of sample size effects on population demographic estimates in birds using single nucleotide polymorphism (SNP) data.

Authors

McLaughlin Jessica F, Winker Kevin

Affiliations

University of Alaska Museum & Department of Biology and Wildlife, University of Alaska Fairbanks, Fairbanks, AK, USA.

Sam Noble Oklahoma Museum of Natural History and Department of Biology, University of Oklahoma, Norman, OK, USA.

Publication information

PeerJ. 2020 Sep 16;8:e9939. doi: 10.7717/peerj.9939. eCollection 2020.

Abstract

Sample size is a critical aspect of study design in population genomics research, yet few empirical studies have examined the impacts of small sample sizes. We used datasets from eight diverging bird lineages to make pairwise comparisons at different levels of taxonomic divergence (populations, subspecies, and species). Our data are from loci linked to ultraconserved elements and our analyses used one single nucleotide polymorphism per locus. All individuals were genotyped at all loci, effectively doubling sample size for coalescent analyses. We estimated population demographic parameters (effective population size, migration rate, and time since divergence) in a coalescent framework using Diffusion Approximation for Demographic Inference, an allele frequency spectrum method. Using divergence-with-gene-flow models optimized with full datasets, we subsampled at sequentially smaller sample sizes from full datasets of 6-8 diploid individuals per population (with both alleles called) down to 1:1, and then we compared estimates and their changes in accuracy. Accuracy was strongly affected by sample size, with considerable differences among estimated parameters and among lineages. Effective population size parameters (ν) tended to be underestimated at low sample sizes (fewer than three diploid individuals per population, or 6:6 haplotypes in coalescent terms). Migration (m) was fairly consistently estimated until <2 individuals per population, and no consistent trend of over- or underestimation was found in either time since divergence (T) or theta (Θ = 4N_refμ). Lineages that were taxonomically recognized above the population level (subspecies and species pairs; that is, deeper divergences) tended to have lower variation in scaled root mean square error of parameter estimation at smaller sample sizes than population-level divergences, and many parameters were estimated accurately down to three diploid individuals per population. Shallower divergence levels (i.e., populations) often required at least five individuals per population for reliable demographic inferences using this approach. Although divergence levels might be unknown at the outset of study design, our results provide a framework for planning appropriate sampling and for interpreting results if smaller sample sizes must be used.
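
To make the accuracy comparison in the abstract concrete, here is a minimal Python sketch of two quantities it mentions: the haplotype count implied by a given number of diploid individuals, and a scaled root mean square error of subsample estimates around a full-dataset estimate. The function names, the choice to scale by the full-dataset estimate, and the numbers are all assumptions for illustration. The abstract does not give the authors' exact formula, and none of these values come from the study.

```python
import math


def haplotypes(diploid_individuals):
    """Chromosomes sampled when both alleles are called for each individual."""
    return 2 * diploid_individuals


def scaled_rmse(estimates, reference):
    """RMSE of estimates around a reference value, divided by that reference.

    Assumption: the reference is the full-dataset estimate, and dividing by
    it makes errors comparable across parameters with different magnitudes.
    """
    if reference == 0:
        raise ValueError("reference must be nonzero")
    mse = sum((e - reference) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse) / abs(reference)


# Hypothetical values for illustration only (not results from the paper).
full_estimate = 2.0
subsample_estimates = [1.0, 3.0]

print(haplotypes(3))                                    # 6, i.e. "6:6" for two populations
print(scaled_rmse(subsample_estimates, full_estimate))  # 0.5
```

Under this definition, a value near zero would mean that estimates from the reduced sample stay close to the full-dataset estimate.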


