Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, DF, Brazil.
J Biomed Sci. 2009 Aug 14;16(1):73. doi: 10.1186/1423-0127-16-73.
A subset of single nucleotide polymorphisms (SNPs), the tagSNPs, can be used to capture information on untyped SNPs in a genomic region. TagSNP transferability from the HapMap dataset to admixed populations is of uncertain value due to the effects of population structure, admixture, drift and recombination. In this work, an empirical dataset from a Brazilian admixed sample was evaluated against the HapMap populations to measure tagSNP transferability and the relative loss of variability prediction.
The transferability study was carried out using SNPs dispersed over four genomic regions: the PTPN22, HMGCR, VDR and CETP genes. Variability coverage and the prediction accuracy for tagSNPs in the selected genomic regions of HapMap phase II were computed using a prediction accuracy algorithm. Transferability of tagSNPs and relative loss of prediction were evaluated according to the difference between the Brazilian sample and the pooled and single HapMap population estimates.
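The comparison described above can be illustrated with a minimal sketch. The abstract does not specify the prediction accuracy algorithm, so pairwise r² between a tagSNP and an untyped SNP is used here as a stand-in, with the commonly used 0.8 threshold as an assumption. The function names, SNP labels and genotype vectors are hypothetical toy values, not data from the study.

```python
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: no variability to predict
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def tag_coverage(genotypes, tags, threshold=0.8):
    """Fraction of non-tag SNPs whose best r^2 with any tag reaches the threshold.

    Assumes `tags` is non-empty and every tag is a key of `genotypes`.
    """
    targets = [snp for snp in genotypes if snp not in tags]
    if not targets:
        return 1.0
    captured = sum(
        1 for snp in targets
        if max(r_squared(genotypes[snp], genotypes[t]) for t in tags) >= threshold
    )
    return captured / len(targets)

def relative_loss(reference, target, tags, threshold=0.8):
    """Coverage in the reference panel minus coverage in the target sample."""
    return tag_coverage(reference, tags, threshold) - tag_coverage(target, tags, threshold)

# Hypothetical toy panels: tags are chosen in the reference and reused in the target.
reference = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],  # in perfect LD with snpA
    "snpC": [2, 1, 0, 2, 1, 0],  # perfectly anti-correlated with snpA
}
target = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],
    "snpC": [0, 0, 1, 2, 2, 1],  # LD with snpA broken in the target sample
}
tags = ["snpA"]

print(tag_coverage(reference, tags))
print(tag_coverage(target, tags))
print(relative_loss(reference, target, tags))
```

A positive relative loss means the tags chosen in the reference panel capture less variability in the target sample. Running the same calculation with tags chosen from a pooled panel and from each single panel would give the kind of contrast the study reports.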
Each population presented different levels of prediction per gene. On average, the Brazilian (BRA) sample displayed lower prediction power than the single HapMap populations and the pooled sample. There was a relative loss of prediction for BRA when tagSNPs from single HapMap populations were used, but a pooled HapMap dataset produced a smaller loss of variability prediction and lower standard deviations, except at the VDR locus, where the loss was smaller with CEU tagSNPs.
Studies that involve tagSNP selection for an admixed population should not, in general, rely on any single HapMap population as the reference; in most cases a pooled dataset represents the admixed population better.