Purging putative siblings from population genetic data sets: a cautionary view.

Authors

Waples Robin S, Anderson Eric C

Affiliations

NOAA Fisheries, Northwest Fisheries Science Center, 2725 Montlake Blvd. East, Seattle, WA, 98112, USA.

NOAA Fisheries, Southwest Fisheries Science Center, 110 McAllister Way, Santa Cruz, CA, 95060, USA.

Publication information

Mol Ecol. 2017 Mar;26(5):1211-1224. doi: 10.1111/mec.14022. Epub 2017 Feb 6.

Abstract

Interest has surged recently in removing siblings from population genetic data sets before conducting downstream analyses. However, even if the pedigree is inferred correctly, this has the potential to do more harm than good. We used computer simulations and empirical samples of coho salmon to evaluate strategies for adjusting samples to account for family structure. We compared performance in full samples and sibling-reduced samples of estimators of allele frequency ($\hat{P}$), population differentiation ($\hat{F}_{\mathrm{ST}}$) and effective population size ($\hat{N}_e$).

RESULTS

(i) Unless simulated samples included large family groups together with a component of unrelated individuals, removing siblings generally reduced precision of $\hat{P}$ and $\hat{F}_{\mathrm{ST}}$.

(ii) $\hat{N}_e$ based on the linkage disequilibrium method was largely unbiased using full random samples but became increasingly upwardly biased under aggressive purging of siblings. Under nonrandom sampling (some families over-represented), $\hat{N}_e$ using full samples was downwardly biased; removing just the right 'Goldilocks' fraction of siblings could produce an unbiased estimate, but this sweet spot varied widely among scenarios.

(iii) Weighting individuals based on the inferred pedigree (to produce a best linear unbiased estimator, BLUE) maximized precision of $\hat{P}$ when the inferred pedigree was correct but performed poorly when the pedigree was wrong.

(iv) A variant of sibling removal that leaves intact small sibling groups appears to be more robust to errors in inferences about family structure.

Our results illustrate the complex challenges posed by presence of family structure, suggest that no single optimal solution exists and argue for caution in adjusting population genetic data sets for the presence of putative siblings without fully understanding the consequences.
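The pedigree-based weighting in result (iii) can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the code below assumes one standard generalized-least-squares construction: the covariance of individual allele fractions is taken to be proportional to the kinship matrix $\Phi$, giving weights $w = \Phi^{-1}\mathbf{1} / (\mathbf{1}^{\top}\Phi^{-1}\mathbf{1})$. It further assumes non-inbred full-sib families with unrelated parents (self-kinship 1/2, full-sib kinship 1/4). The function names, family labels and genotypes are hypothetical and are not taken from the paper.

```python
import numpy as np


def full_sib_kinship(family_ids):
    """Kinship matrix for non-inbred individuals grouped into full-sib families.

    Assumption: parents are unrelated and non-inbred, so self-kinship is 1/2,
    full-sib kinship is 1/4, and every other pair has kinship 0.
    """
    fam = np.asarray(family_ids)
    same_family = fam[:, None] == fam[None, :]
    phi = np.where(same_family, 0.25, 0.0)
    np.fill_diagonal(phi, 0.5)
    return phi


def blue_weights(phi):
    """Weights w = Phi^{-1} 1 / (1' Phi^{-1} 1); they sum to 1."""
    ones = np.ones(phi.shape[0])
    raw = np.linalg.solve(phi, ones)  # solves Phi x = 1 without forming the inverse
    return raw / raw.sum()


def blue_allele_frequency(genotypes, phi):
    """Weighted allele frequency; genotypes are counts (0, 1, 2) of the focal allele."""
    y = np.asarray(genotypes, dtype=float) / 2.0  # per-individual allele fraction
    return float(blue_weights(phi) @ y)


# Hypothetical sample: three full siblings (family "A") and one unrelated fish ("B").
family_ids = ["A", "A", "A", "B"]
genotypes = [2, 2, 1, 0]

phi = full_sib_kinship(family_ids)
weights = blue_weights(phi)
p_naive = sum(genotypes) / (2 * len(genotypes))  # equal weight per individual
p_blue = blue_allele_frequency(genotypes, phi)

print("weights:", weights)
print("naive estimate:", p_naive)
print("weighted estimate:", p_blue)
```

Under these assumptions each member of a full-sib family of size k receives a weight proportional to 1/(1 + k), so siblings are down-weighted rather than discarded. Because the weights depend entirely on the supplied family labels, a mis-assigned pedigree changes the estimate directly, which is in line with the sensitivity to pedigree errors noted in result (iii).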
