Sun Yan V, Jacobsen Douglas M, Turner Stephen T, Boerwinkle Eric, Kardia Sharon L R
Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan.
Comput Stat Data Anal. 2009 Mar 15;53(5):1794-1801. doi: 10.1016/j.csda.2008.04.013.
A single nucleotide polymorphism (SNP) association scan statistic was developed to account for the complex genomic distribution of SNP variations when identifying chromosomal regions with significant SNP effects. To address the computational needs of genome-wide association (GWA) studies, a fast Java application was developed and implemented that combines single-locus SNP tests with a scan statistic to identify chromosomal regions containing significant clusters of SNP effects. To illustrate this application, SNP associations were analyzed in a pharmacogenomic study of the blood-pressure-lowering effect of thiazide diuretics (N=195) using the Affymetrix Human Mapping 100K Set. To reduce the frequency correlation between SNPs, 55,335 tagSNPs (pairwise linkage disequilibrium r² < 0.5) were selected. A typical workstation can complete the whole-genome scan, including 10,000 permutation tests, within 3 hours. The most significant regions are located on chromosomes 3, 6, 13, and 16, two of which contain candidate genes that may be involved in the underlying drug response mechanism. The computational performance and scalability of ChromoScan-GWA were tested with up to 1,000,000 SNPs and up to 4,000 subjects. With 10,000 permutations, the computation time grew linearly across these datasets. This scan statistic application provides a robust statistical and computational foundation for identifying genomic regions associated with disease, and it provides a method for comparing GWA results even across different platforms.
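The idea of combining single-locus tests with a scan statistic and a permutation test can be illustrated with a minimal Java sketch. It is not the ChromoScan-GWA implementation: the class name `ScanSketch`, the fixed window of `w` consecutive SNPs, the threshold `alpha`, and the choice to permute SNP positions (rather than, for example, phenotype labels) are all assumptions made for illustration. The sketch takes per-SNP p-values in chromosomal order, finds the largest number of significant SNPs in any window, and estimates how often a random ordering produces a cluster at least that dense.

```java
import java.util.Random;

// Hypothetical illustration only; names and design are assumptions, not the published method.
class ScanSketch {

    /** Largest number of SNPs with p < alpha found in any window of w consecutive SNPs. */
    static int maxWindowCount(double[] pValues, double alpha, int w) {
        int count = 0;
        int best = 0;
        for (int i = 0; i < pValues.length; i++) {
            if (pValues[i] < alpha) count++;                    // SNP i enters the window
            if (i >= w && pValues[i - w] < alpha) count--;      // SNP i-w leaves the window
            if (count > best) best = count;
        }
        return best;
    }

    /**
     * Permutation p-value for the observed maximum window count.
     * Assumption: the null is generated by shuffling SNP positions (Fisher-Yates),
     * which breaks any spatial clustering while keeping the set of p-values fixed.
     */
    static double permutationPValue(double[] pValues, double alpha, int w,
                                    int nPerm, long seed) {
        int observed = maxWindowCount(pValues, alpha, w);
        double[] shuffled = pValues.clone();
        Random rng = new Random(seed);
        int atLeastAsExtreme = 0;
        for (int p = 0; p < nPerm; p++) {
            for (int i = shuffled.length - 1; i > 0; i--) {
                int j = rng.nextInt(i + 1);
                double tmp = shuffled[i];
                shuffled[i] = shuffled[j];
                shuffled[j] = tmp;
            }
            if (maxWindowCount(shuffled, alpha, w) >= observed) atLeastAsExtreme++;
        }
        // Add-one correction so the estimate is never exactly zero.
        return (atLeastAsExtreme + 1.0) / (nPerm + 1.0);
    }
}
```

Each permutation costs one pass over the SNPs, so the running time of this sketch grows linearly with the number of SNPs and with the number of permutations. A call such as `ScanSketch.permutationPValue(pValues, 0.05, 10, 10000, 42L)` would return the estimated p-value for the densest window.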