Next-generation sequencing for molecular ecology: a caveat regarding pooled samples.

Author information

Fisheries Ecology Division, Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, 110 Shaffer Road, Santa Cruz, CA, 95060, USA; Department of Applied Math and Statistics (SOE2), University of California, 1156 High Street, Santa Cruz, CA, 95064, USA.

Publication information

Mol Ecol. 2014 Feb;23(3):502-12. doi: 10.1111/mec.12609.

Abstract

We develop a model based on the Dirichlet-compound multinomial distribution (CMD) and the Ewens sampling formula to predict the fraction of SNP loci that will appear fixed for alternate alleles between two pooled samples drawn from the same underlying population. We apply this model to next-generation sequencing (NGS) data from Baltic Sea herring recently published by Corander et al. (2013, Molecular Ecology, 2931-2940), and show that there are many more fixed loci than expected in the absence of genetic structure. However, we show through coalescent simulations that the degree of population structure required to explain the fraction of alternatively fixed SNPs is extraordinarily high, and that the surplus of fixed loci is more likely a consequence of limited representation of individual gene copies in the pooled samples than of population structure. Our analysis signals that the use of NGS on pooled samples to identify divergent SNPs warrants caution. With pooled samples, it is hard to diagnose when an NGS experiment has gone awry. Especially when NGS data on pooled samples are of low read depth with a limited number of individuals, it may be worthwhile to temper claims of unexpected population differentiation from pooled samples, pending verification with more reliable methods or stricter adherence to recommended sampling designs for pooled sequencing (e.g. Futschik & Schlötterer 2010, Genetics, 186, 207; Gautier et al., 2013a, Molecular Ecology, 3766-3779). Analysis of the data and diagnosis of problems are easier and more reliable (and can be less costly) with individually barcoded samples. Consequently, for some scenarios, individual barcoding may be preferable to pooling of samples.
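To make the central point concrete, the following is a minimal sketch of why limited representation of gene copies can inflate the fraction of loci that appear alternately fixed. It is not the CMD/Ewens model described in the abstract. It is a simplified binomial illustration that assumes a single biallelic SNP with a known population allele frequency `p`, two pools drawn independently from the same population, reads sampled with replacement from the gene copies actually represented in each pool, and no sequencing error. The function names and parameters (`n_copies`, `depth`) are hypothetical.

```python
from math import comb


def prob_all_reads_one_allele(p, n_copies, depth):
    """Probability that every one of `depth` reads from a pool shows the
    allele whose population frequency is `p`.

    Assumption: the pool effectively contains `n_copies` gene copies, of which
    K ~ Binomial(n_copies, p) carry the allele, and each read is drawn with
    replacement from those copies.
    """
    total = 0.0
    for k in range(n_copies + 1):
        # Probability that exactly k of the represented copies carry the allele.
        pool_prob = comb(n_copies, k) * p**k * (1 - p) ** (n_copies - k)
        # Given k, all reads show the allele with probability (k / n_copies)^depth.
        total += pool_prob * (k / n_copies) ** depth
    return total


def prob_alternately_fixed(p, n_copies, depth):
    """Probability that two independent pools from the SAME population appear
    fixed for alternate alleles at one locus (simplified sketch)."""
    all_a = prob_all_reads_one_allele(p, n_copies, depth)
    all_b = prob_all_reads_one_allele(1 - p, n_copies, depth)
    # Pool 1 all-A and pool 2 all-B, or the reverse.
    return 2 * all_a * all_b


if __name__ == "__main__":
    # Same read depth, different numbers of effectively represented gene copies.
    for n_copies in (2, 4, 20, 100):
        print(n_copies, prob_alternately_fixed(0.5, n_copies, depth=10))
```

Under these assumptions, holding read depth fixed while reducing the number of gene copies that are effectively represented raises the probability of apparent alternate fixation, even though both pools come from one population. This is the qualitative effect the abstract attributes to pooled samples with limited representation.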
