Ascertainment biases in SNP chips affect measures of population divergence.

Affiliation

Department of Biostatistics, Copenhagen University, Copenhagen, Denmark.

Publication information

Mol Biol Evol. 2010 Nov;27(11):2534-47. doi: 10.1093/molbev/msq148. Epub 2010 Jun 17.

Abstract

Chip-based high-throughput genotyping has facilitated genome-wide studies of genetic diversity. Many studies have used these large data sets to make inferences about the demographic history of human populations using measures of genetic differentiation such as F(ST) or principal component analyses. However, single nucleotide polymorphism (SNP) chip data suffer from ascertainment biases caused by the SNP discovery process, in which a small number of individuals from selected populations are used as discovery panels. In this study, we investigate the effect of ascertainment bias on inferences regarding genetic differentiation among populations in one of the common genome-wide genotyping platforms. We generate SNP genotyping data for individuals who have previously been subjected to partial genome-wide Sanger sequencing and compare inferences based on genotyping data to inferences based on direct sequencing. In addition, we analyze publicly available genome-wide data. We demonstrate that ascertainment biases will distort measures of human diversity and possibly change conclusions drawn from these measures in sometimes unexpected ways. We also show that details of the genotype-calling algorithms can have a surprisingly large effect on population genetic inferences. We not only present a correction of the spectrum for the widely used Affymetrix SNP chips but also show that such corrections are difficult to generalize among studies.
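The discovery-panel mechanism described in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not the correction method presented in the paper: it assumes a SNP is placed on a chip only if it appears polymorphic in a random discovery sample of `n_chromosomes` chromosomes, which happens with probability 1 - p^n - (1 - p)^n for a site with allele frequency p. The function names, the four frequency classes, and the 1/p weights standing in for the true frequency distribution are all hypothetical choices made for illustration.

```python
def discovery_probability(p: float, n_chromosomes: int) -> float:
    """Probability that a site with allele frequency p looks polymorphic
    in a random sample of n_chromosomes chromosomes (both alleles seen)."""
    return 1.0 - p ** n_chromosomes - (1.0 - p) ** n_chromosomes


def ascertained_shares(freqs, weights, n_chromosomes):
    """Reweight each frequency class by its discovery probability and
    renormalize, giving the class shares expected among ascertained SNPs."""
    raw = [w * discovery_probability(p, n_chromosomes)
           for p, w in zip(freqs, weights)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical allele-frequency classes; weights proportional to 1/p are an
# assumption used only to give rare alleles more sites than common ones.
freqs = [0.01, 0.05, 0.2, 0.5]
weights = [1.0 / p for p in freqs]
true_share = [w / sum(weights) for w in weights]

# Assumed discovery panel of two diploid individuals (four chromosomes).
chip_share = ascertained_shares(freqs, weights, n_chromosomes=4)

for p, t, c in zip(freqs, true_share, chip_share):
    print(f"p={p:<5} share among all sites={t:.3f}  share on chip={c:.3f}")
```

Under these assumptions, the rarest class makes up a smaller share of the ascertained SNPs than of all sites, and the most common class a larger one. This shift in the frequency spectrum is the kind of distortion that can propagate into frequency-dependent statistics such as F(ST).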


