Guo Yanfang, Li Jian, Bonham Aaron J, Wang Yuping, Deng Hongwen
The Key Laboratory of Biomedical Information Engineering of Ministry of Education and Institute of Molecular Genetics, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, PR China.
Eur J Hum Genet. 2009 Jun;17(6):785-92. doi: 10.1038/ejhg.2008.244. Epub 2008 Dec 17.
Linkage disequilibrium (LD)-based association mapping is often performed by analyzing either individual SNPs or block-based multi-SNP haplotypes. Sliding windows of several fixed sizes (measured in number of SNPs) have also been applied to a few simulated or real data sets. In comparison, exhaustive testing based on variable-sized sliding windows (VSW), covering every possible window size over a genomic region, has the best chance of capturing the optimal markers (single SNPs or haplotypes) that are most significantly associated with the traits under study. The cost, however, is a larger number of tests to correct for and a heavier computational load. Here, a strategy of VSW of all possible sizes is proposed and its power is examined in comparison with strategies that test only haplotype blocks (BLK) or single SNP loci (SGL). Critical values for statistical significance testing that account for multiple testing are obtained by simulation. We demonstrate that, over a wide range of simulated parameters, VSW increased the power to detect disease variants by approximately 1-15% over the BLK and SGL approaches. The improvement was more pronounced in regions with high recombination rates. In an empirical data set, VSW obtained the most significant signal and identified the low-density lipoprotein receptor-related protein 5 (LRP5) gene as strongly associated with osteoporosis. With computational techniques such as parallel algorithms and cluster computing, it is feasible to apply VSW to large genomic regions or to regions preliminarily identified by traditional SGL/BLK methods.
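To make the scale of the exhaustive scan concrete, the following is a minimal Python sketch of enumerating every contiguous window over a region and testing each one. It is an illustration under stated assumptions, not the authors' implementation: haplotypes are assumed to be already phased and encoded as tuples of 0/1 alleles, the test is a plain Pearson chi-square on a 2 x k table of haplotype counts, and the function names (`variable_sized_windows`, `haplotype_chi2`, `scan_all_windows`) are hypothetical. For n SNPs the scan visits n(n+1)/2 windows, compared with n tests for SGL.

```python
from collections import Counter


def variable_sized_windows(n_snps):
    """Yield half-open (start, end) index pairs for every contiguous window."""
    for size in range(1, n_snps + 1):
        for start in range(n_snps - size + 1):
            yield start, start + size


def haplotype_chi2(case_haps, control_haps, start, end):
    """Pearson chi-square on the 2 x k table of window haplotype counts.

    Assumes both groups are non-empty and haplotypes are phased 0/1 tuples.
    Returns (statistic, degrees_of_freedom).
    """
    case_counts = Counter(tuple(h[start:end]) for h in case_haps)
    ctrl_counts = Counter(tuple(h[start:end]) for h in control_haps)
    n_case, n_ctrl = len(case_haps), len(control_haps)
    total = n_case + n_ctrl
    observed_haps = set(case_counts) | set(ctrl_counts)
    stat = 0.0
    for hap in observed_haps:
        col_total = case_counts[hap] + ctrl_counts[hap]
        for observed, row_total in ((case_counts[hap], n_case),
                                    (ctrl_counts[hap], n_ctrl)):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, len(observed_haps) - 1


def scan_all_windows(case_haps, control_haps):
    """Test every window; returns a list of (start, end, statistic, df)."""
    n_snps = len(case_haps[0])
    results = []
    for start, end in variable_sized_windows(n_snps):
        stat, df = haplotype_chi2(case_haps, control_haps, start, end)
        results.append((start, end, stat, df))
    return results


# Toy data (invented for illustration only): two SNPs, two haplotypes per group.
cases = [(0, 1), (0, 1)]
controls = [(1, 0), (1, 0)]
results = scan_all_windows(cases, controls)
```

Because the degrees of freedom differ between windows, raw statistics are not directly comparable; a real analysis would convert each to a p-value and judge the smallest against a critical value that accounts for all windows tested, as the abstract describes.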