Zhang Wei, Duan Shiwei, Dolan M Eileen
Section of Hematology/Oncology, Department of Medicine, Cancer Research Center, The University of Chicago, IL 60637, USA.
Bioinformation. 2008 May 13;2(8):322-4. doi: 10.6026/97320630002322.
The International HapMap Project provides a resource of genotypic data on single nucleotide polymorphisms (SNPs), which can be used in association studies to identify the genetic determinants of phenotypic variation. Before an association study, the HapMap dataset should be preprocessed to reduce computation time and to control the multiple testing problem. Less informative SNPs are removed, including those with a very low genotyping rate and those whose minor allele is rare in one or more populations. Some research designs use SNPs from only a subset of the HapMap cell lines. Although the HapMap website and other association software packages provide some basic tools for optimizing these datasets, a fast and user-friendly program that outputs filtered genotypic data would benefit association studies. Here, we present a flexible, straightforward bioinformatics program for preparing HapMap genotypic data for association studies. The user specifies the cell lines and two common filtering criteria: minor allele frequency and genotyping rate. The software was developed for Microsoft Windows and written in C++.
The Windows executable and the Microsoft Visual C++ source code are available at Google Code (http://hapmap-filter-v1.googlecode.com/) or upon request. Their distribution is subject to the GNU General Public License v3.
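The two filtering criteria named in the abstract can be illustrated with a minimal C++ sketch. This is not the program's actual source: the names `SnpStats`, `computeStats`, and `passesFilter` are hypothetical, the thresholds are caller-supplied rather than defaults taken from the software, and the sketch assumes genotypes are two-character strings with `NN` marking a missing call.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical summary of one SNP over the selected cell lines.
struct SnpStats {
    double genotypingRate;   // fraction of selected samples with a called genotype
    double minorAlleleFreq;  // frequency of the less common allele among called alleles
};

// Assumption: each genotype is a two-character string such as "AG",
// and "NN" marks a missing call. `selected` holds the column indices
// of the cell lines to keep.
SnpStats computeStats(const std::vector<std::string>& genotypes,
                      const std::vector<std::size_t>& selected) {
    std::map<char, int> alleleCounts;
    std::size_t called = 0;
    for (std::size_t idx : selected) {
        const std::string& g = genotypes.at(idx);
        if (g.size() != 2 || g[0] == 'N' || g[1] == 'N') continue;  // missing call
        ++alleleCounts[g[0]];
        ++alleleCounts[g[1]];
        ++called;
    }

    SnpStats stats{0.0, 0.0};
    if (selected.empty() || called == 0) return stats;

    stats.genotypingRate = static_cast<double>(called) / selected.size();

    // Monomorphic SNPs keep a minor allele frequency of 0. SNPs with more
    // than two observed alleles are also left at 0 in this sketch, so any
    // positive frequency threshold removes them.
    if (alleleCounts.size() == 2) {
        const int a = alleleCounts.begin()->second;
        const int b = alleleCounts.rbegin()->second;
        stats.minorAlleleFreq = static_cast<double>(std::min(a, b)) / (a + b);
    }
    return stats;
}

// Keep a SNP only if it meets both caller-supplied thresholds.
bool passesFilter(const SnpStats& stats, double minGenotypingRate, double minMaf) {
    return stats.genotypingRate >= minGenotypingRate &&
           stats.minorAlleleFreq >= minMaf;
}
```

Restricting `selected` to a subset of cell lines changes both statistics, which is why the sketch computes them after the subset is chosen rather than over all samples.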