Yoo Jinho, Lee Youngbok, Kim Yujung, Rha Sun Young, Kim Yangseok
Cancer Metastasis Research Center, Yonsei University College of Medicine, Seoul, Republic of Korea.
BMC Bioinformatics. 2008 Jun 23;9:290. doi: 10.1186/1471-2105-9-290.
Since the completion of the HapMap project, laboratories of many kinds have generated large numbers of individual genotypes. Efforts to find and interpret genetic associations between disease and SNPs/haplotypes are widespread. As a result, the need for tools that can analyze large data sets and support diverse interpretation of the results is growing rapidly.
We have developed a tool that performs linkage disequilibrium (LD) analysis and genetic association analysis between disease and SNPs/haplotypes in an integrated web interface. It comprises four main analysis modules: (i) data import and preprocessing, (ii) haplotype estimation, (iii) LD blocking and (iv) association analysis. During data preprocessing, a Hardy-Weinberg equilibrium test is performed for each SNP. Haplotypes are reconstructed from unphased diploid genotype data, and LD between pairs of SNPs is computed and reported as D', r² and LOD score. Tagging SNPs are selected using the square of Pearson's correlation coefficient (r²). If genotypes from two different sample groups are available, genetic association analyses are carried out under additive, codominant, dominant and recessive models. Multiple verified algorithms and statistics are implemented in parallel to improve the reliability of the analysis.
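To make two of the statistics named above concrete, the following is a minimal Python sketch of a Hardy-Weinberg equilibrium chi-square test for one biallelic SNP and of D' and r² for one pair of SNPs. It is an illustration under stated assumptions, not SNPAnalyzer's implementation: the function names `hwe_chi_square` and `pairwise_ld` are hypothetical, the LD function assumes that phased haplotype counts are already available (the tool itself estimates haplotypes from unphased genotypes), and the LOD score is not computed.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test (1 degree of freedom) of Hardy-Weinberg equilibrium.

    Takes observed genotype counts for one biallelic SNP and returns
    (chi2, p_value). Hypothetical helper, not part of SNPAnalyzer.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele a
    if p == 0 or q == 0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """D' and r^2 for two biallelic SNPs from phased haplotype counts.

    Assumes haplotypes are already phased. Returns (d_prime, r2).
    Hypothetical helper, not part of SNPAnalyzer.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # allele A at the first SNP
    p_B = (n_AB + n_aB) / total  # allele B at the second SNP
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic; LD is undefined")
    d = p_AB - p_A * p_B
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    r2 = d * d / denom
    return d_prime, r2


if __name__ == "__main__":
    print(hwe_chi_square(25, 50, 25))
    print(pairwise_ld(40, 10, 10, 40))
```

A p-value threshold for flagging SNPs that depart from Hardy-Weinberg equilibrium, and an r² threshold for choosing tagging SNPs, would be set by the analyst. Neither value is specified in this abstract.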
SNPAnalyzer 2.0 performs linkage disequilibrium analysis and genetic association analysis in an integrated web interface using multiple verified algorithms and statistics. It combines diverse analysis methods, the capacity to handle large data sets and visual comparison of analysis results in a comprehensive, easy-to-use tool.