Thomas Alun
Department of Medical Informatics and Center for High Performance Computing, University of Utah, 391 Chipeta Way Suite D, Salt Lake City, UT 84108, USA.
Bioinformatics. 2003 Oct 12;19(15):2002-3. doi: 10.1093/bioinformatics/btg254.
GCHap quickly finds maximum likelihood estimates (MLEs) of haplotype frequencies given genotype information on a random sample of individuals. It uses the gene counting method, but by excluding haplotypes with zero MLE at an early stage, this implementation uses many orders of magnitude less space and time than naive implementations. A second program, ApproxGCHap, is provided to give alternative estimates for data sets with large numbers of loci or large amounts of missing genotype data.
The Java classes and Javadocs pages for GCHap can be obtained from bioinformatics.med.utah.edu/~alun
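To illustrate the gene counting method named in the abstract, the following is a minimal sketch of an expectation-maximization loop for biallelic loci. It is not the GCHap source code and does not reproduce its API. The class name `GeneCountingSketch`, the methods `compatiblePairs` and `estimate`, the 0/1/2 genotype coding, and the `pruneBelow` threshold are all assumptions made for this example. Dropping haplotypes whose estimated frequency falls below the threshold is only a simple stand-in for the early exclusion of zero-MLE haplotypes described above. Missing genotypes are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; hypothetical names, not the GCHap classes.
class GeneCountingSketch {

    // genotype[i] is the number of copies (0, 1 or 2) of allele "1" at locus i.
    // Returns every ordered haplotype pair consistent with the genotype.
    static List<String[]> compatiblePairs(int[] genotype) {
        List<String[]> pairs = new ArrayList<>();
        pairs.add(new String[] {"", ""});
        for (int g : genotype) {
            List<String[]> next = new ArrayList<>();
            for (String[] p : pairs) {
                if (g == 0) {
                    next.add(new String[] {p[0] + "0", p[1] + "0"});
                } else if (g == 2) {
                    next.add(new String[] {p[0] + "1", p[1] + "1"});
                } else { // heterozygous locus: keep both phases
                    next.add(new String[] {p[0] + "0", p[1] + "1"});
                    next.add(new String[] {p[0] + "1", p[1] + "0"});
                }
            }
            pairs = next;
        }
        return pairs;
    }

    // Gene counting (EM) with a crude pruning step (an assumption of this sketch).
    static Map<String, Double> estimate(int[][] genotypes, int iterations, double pruneBelow) {
        List<List<String[]>> pairsPerPerson = new ArrayList<>();
        Map<String, Double> freq = new HashMap<>();
        for (int[] g : genotypes) {
            List<String[]> pairs = compatiblePairs(g);
            pairsPerPerson.add(pairs);
            for (String[] p : pairs) {
                freq.put(p[0], 0.0);
                freq.put(p[1], 0.0);
            }
        }
        // Start from equal frequencies over all haplotypes seen in any pair.
        final double uniform = 1.0 / freq.size();
        freq.replaceAll((h, f) -> uniform);

        for (int it = 0; it < iterations; it++) {
            // E-step: expected haplotype counts given the current frequencies.
            Map<String, Double> counts = new HashMap<>();
            for (List<String[]> pairs : pairsPerPerson) {
                double total = 0.0;
                for (String[] p : pairs) {
                    total += freq.getOrDefault(p[0], 0.0) * freq.getOrDefault(p[1], 0.0);
                }
                if (total == 0.0) continue; // guard against division by zero
                for (String[] p : pairs) {
                    double w = freq.getOrDefault(p[0], 0.0) * freq.getOrDefault(p[1], 0.0) / total;
                    if (w == 0.0) continue;
                    counts.merge(p[0], w, Double::sum);
                    counts.merge(p[1], w, Double::sum);
                }
            }
            // M-step: counts become frequencies; tiny ones are dropped, the rest renormalized.
            Map<String, Double> next = new HashMap<>();
            double kept = 0.0;
            for (Map.Entry<String, Double> e : counts.entrySet()) {
                double f = e.getValue() / (2.0 * genotypes.length);
                if (f > pruneBelow) {
                    next.put(e.getKey(), f);
                    kept += f;
                }
            }
            final double norm = kept;
            next.replaceAll((h, f) -> f / norm);
            freq = next;
        }
        return freq;
    }

    public static void main(String[] args) {
        // Two loci: four 0/0 homozygotes, four 1/1 homozygotes, two double heterozygotes.
        int[][] sample = {{0, 0}, {0, 0}, {0, 0}, {0, 0}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {1, 1}, {1, 1}};
        System.out.println(estimate(sample, 50, 1e-9));
    }
}
```

In this sketch the number of candidate haplotype pairs doubles with every heterozygous locus, which is the kind of growth in space and time that removing zero-frequency haplotypes early is meant to avoid.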