Stefanov Stefan, Lautenberger James, Gold Bert
Human Genetics Section, Laboratory of Experimental Immunology, National Cancer Institute at Frederick, Frederick, MD 21702, USA.
Cancer Inform. 2008;6:455-61. doi: 10.4137/cin.s966. Epub 2008 Sep 24.
We developed an efficient pipeline to analyze single nucleotide polymorphism (SNP) scan results from genome-wide association studies. Perl scripts were used to convert genotypes called with the BRLMM algorithm into a modified PB format. We computed summary statistics characterizing our case and control populations, including allele counts, missing values, heterozygosity, measures of compliance with Hardy-Weinberg equilibrium, and several population difference statistics. At every SNP we also computed association tests: exact tests of association for genotypes and alleles, the Cochran-Armitage linear trend test, and tests under dominant, recessive, and overdominant models. Pairwise linkage disequilibrium statistics were computed with the command-line version of HaploView, which we made possible by writing a reformatting script. Additional Perl scripts load the results into a MySQL database coupled to a Generic Genome Browser (gbrowse) for comprehensive visualization. The browser incorporates a download feature that gives users the actual case and control genotypes in associated genomic regions. Thus, re-analysis "on the fly" is possible for casual browser users from anywhere on the Internet.
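As an illustration of two of the per-SNP statistics named in the abstract, the following is a minimal sketch in Python rather than the authors' Perl code. It assumes genotype counts are already tabulated as (AA, AB, BB) triples, uses the asymptotic chi-square form of the Hardy-Weinberg check instead of an exact test, and applies additive weights (0, 1, 2) for the Cochran-Armitage trend test. The function names `hwe_chi2` and `trend_test` are hypothetical and do not come from the pipeline.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit check against Hardy-Weinberg proportions.

    Returns (chi2, p_value). This is the asymptotic test, not an exact test.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 0.0, 1.0
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        # Monomorphic SNP: the test is undefined, so report no deviation.
        return 0.0, 1.0
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_sf_1df(chi2)


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage linear trend test on a 2 x 3 genotype table.

    cases and controls are (AA, AB, BB) counts. Returns (chi2, p_value).
    """
    r_total = sum(cases)
    s_total = sum(controls)
    n_total = r_total + s_total
    if r_total == 0 or s_total == 0:
        return 0.0, 1.0
    col = [r + s for r, s in zip(cases, controls)]  # genotype column totals
    # Trend statistic: weighted difference between case and control counts.
    t = sum(w * (r * s_total - s * r_total)
            for w, r, s in zip(weights, cases, controls))
    # Variance of the statistic under the null hypothesis of no association.
    sum_w2n = sum(w * w * c for w, c in zip(weights, col))
    sum_wn = sum(w * c for w, c in zip(weights, col))
    var = (r_total * s_total / float(n_total)) * (n_total * sum_w2n - sum_wn ** 2)
    if var <= 0:
        return 0.0, 1.0
    chi2 = t * t / var
    return chi2, chi2_sf_1df(chi2)
```

Both functions return a statistic with one degree of freedom, so a single helper converts either to a p-value. For SNPs with small genotype counts, the exact tests described in the abstract would be preferable to these approximations.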