Department of Bioinformatics, University of Tehran, Tehran, Iran.
J Biomed Inform. 2010 Oct;43(5):800-4. doi: 10.1016/j.jbi.2010.05.011. Epub 2010 May 28.
Single Nucleotide Polymorphisms (SNPs) provide valuable information on human evolutionary history and may lead us to identify genetic variants responsible for human complex diseases. Unfortunately, molecular haplotyping methods are costly, laborious, and time consuming; therefore, algorithms for constructing full haplotype patterns from small available data through computational methods, Tag SNP selection problem, are convenient and attractive. This problem is proved to be an NP-hard problem, so heuristic methods may be useful. In this paper we present a heuristic method based on genetic algorithm to find reasonable solution within acceptable time. The algorithm was tested on a variety of simulated and experimental data. In comparison with the exact algorithm, based on brute force approach, results show that our method can obtain optimal solutions in almost all cases and runs much faster than exact algorithm when the number of SNP sites is large. Our software is available upon request to the corresponding author.
单核苷酸多态性(SNPs)提供了有价值的人类进化历史信息,并可能有助于我们确定导致人类复杂疾病的遗传变异。不幸的是,分子单体型分析方法昂贵、费力且耗时;因此,通过计算方法构建完整单体型模式的算法,即标签单核苷酸多态性选择问题,是一种方便且有吸引力的方法。这个问题已被证明是 NP 难问题,因此启发式方法可能是有用的。在本文中,我们提出了一种基于遗传算法的启发式方法,以便在可接受的时间内找到合理的解决方案。该算法已在各种模拟和实验数据上进行了测试。与基于暴力破解的精确算法相比,结果表明,我们的方法几乎可以在所有情况下获得最优解,并且在 SNP 位置数量较大时,运行速度比精确算法快得多。我们的软件可以向相应的作者请求获得。