Yan Bin, Wang Shudong, Jia Huaqian, Liu Xing, Wang Xinzeng
College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, Shandong, 266590, China.
College of Computer and Communication Engineering, China University of Petroleum, Qingdao, Shandong, 266580, China.
BMC Genet. 2015 Mar 13;16:25. doi: 10.1186/s12863-015-0182-3.
Single-nucleotide polymorphism (SNP)-set analysis in genome-wide association studies (GWAS) has become an active research area for identifying genetic variants associated with disease susceptibility. However, most existing SNP-set methods are sensitive to the quality of the SNP-set, and a poorly constructed SNP-set can lead to low power in GWAS.
In this study, we propose an efficient weighted tag-SNP-set method for detecting disease associations. We first design a fast algorithm that selects a subset of SNPs (called the tag SNP-set) from a given original SNP-set based on the linkage disequilibrium (LD) between SNPs. We then assign a weight to each selected tag SNP and test the joint effect of the weighted tag SNPs. Extensive simulations show that the weighted tag-SNP-set-based test has much higher power than both the weighted original-SNP-set-based test and the unweighted tag-SNP-set-based test. We also compare the power of the weighted tag-SNP-set-based test across four types of tag SNP-sets. The simulation results indicate that the method used to select the tag SNP-set strongly affects power, and that our proposed method achieves the highest power among those compared.
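The two-stage idea described above (LD-based tag selection, then a weighted joint test) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the selection rule or the weight function, so the sketch assumes a greedy selection on pairwise r² with a hypothetical threshold of 0.8, an illustrative inverse-standard-deviation weight based on allele frequency, and a simple permutation test on a weighted genotype score. The function names `greedy_tag_snps` and `weighted_score_pvalue` are made up for this example.

```python
import numpy as np


def ld_r2(genotypes):
    """Squared Pearson correlation between SNP columns, used as an LD proxy."""
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.corrcoef(genotypes, rowvar=False)
    return np.nan_to_num(r) ** 2  # monomorphic SNPs give NaN; treat as no LD


def greedy_tag_snps(genotypes, threshold=0.8):
    """Greedy tag selection (assumed rule): repeatedly pick the SNP that
    covers the most still-untagged SNPs at r^2 >= threshold."""
    r2 = ld_r2(genotypes)
    untagged = set(range(r2.shape[0]))
    tags = []
    while untagged:
        best = max(
            sorted(untagged),
            key=lambda j: sum(r2[j, k] >= threshold for k in untagged),
        )
        tags.append(best)
        untagged -= {k for k in untagged if r2[best, k] >= threshold}
        untagged.discard(best)  # guarantees progress even for monomorphic SNPs
    return sorted(tags)


def weighted_score_pvalue(genotypes, phenotype, tags, n_perm=1000, seed=0):
    """Permutation p-value for the case/control difference in a weighted
    sum of tag-SNP genotypes. The weight is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    g = genotypes[:, tags].astype(float)
    freq = g.mean(axis=0) / 2.0  # frequency of the coded allele
    weights = 1.0 / np.sqrt(np.clip(freq * (1.0 - freq), 1e-6, None))
    score = g @ weights  # one weighted score per individual

    def stat(y):
        return abs(score[y == 1].mean() - score[y == 0].mean())

    observed = stat(phenotype)
    exceed = sum(stat(rng.permutation(phenotype)) >= observed for _ in range(n_perm))
    return (1 + exceed) / (n_perm + 1)


# Toy data: rows are individuals, columns are SNPs coded 0/1/2.
# SNPs 0 and 1 are in perfect LD; SNP 2 is only weakly correlated with them.
toy_genotypes = np.array(
    [[0, 0, 1], [1, 1, 0], [2, 2, 1], [0, 0, 2], [1, 1, 0], [2, 2, 1]]
)
toy_phenotype = np.array([0, 0, 1, 0, 1, 1])  # 1 = case, 0 = control

toy_tags = greedy_tag_snps(toy_genotypes)
toy_pvalue = weighted_score_pvalue(toy_genotypes, toy_phenotype, toy_tags, n_perm=200)
```

In this toy input the two perfectly correlated SNPs collapse to a single tag, which is the redundancy reduction that tag selection is meant to provide. The threshold, weight function, and test statistic are all placeholders and would need to be replaced by the definitions in the full paper to reproduce its results.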
From the analysis of replicated simulated data sets, we conclude that the weighted tag-SNP-set-based test is a powerful SNP-set test for GWAS. We also designed a faster algorithm for selecting tag SNPs that retain most of the information in the original SNP-set, and an improved weight function that describes the status of each tag SNP in GWAS.