Department of Computer Science and Engineering, Sejong University, Seoul 05006, Korea.
Cancer Research Institute, College of Medicine, The Catholic University of Korea, Seoul 06591, Korea.
Bioinformatics. 2018 Jun 1;34(11):1801-1807. doi: 10.1093/bioinformatics/bty012.
Single-individual haplotyping (SIH) is critical in genomic association studies and genetic disease analysis. However, most genomic analysis studies do not perform haplotype phasing because of its complexity. Several computational methods have been developed to solve the SIH problem, but these approaches have not generated sufficiently reliable haplotypes.
Here, we propose a novel SIH algorithm, called PEATH (Probabilistic Evolutionary Algorithm with Toggling for Haplotyping), to achieve more accurate and reliable haplotyping. PEATH was compared with the most recent algorithms in terms of phased length, N50 length, switch error rate and minimum error correction (MEC). PEATH consistently provides the best phased and N50 lengths, that is, the longest achievable for the given datasets. In addition, evaluation on simulated data demonstrated that PEATH outperforms other methods on highly noisy data. The experimental results on a real dataset further confirmed that PEATH achieves comparable or better accuracy.
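Three of the evaluation measures named above (MEC, switch error rate and N50 length) can be illustrated with a minimal Python sketch. It follows the commonly used textbook definitions and is not taken from the PEATH source code. The function names, the string encoding of fragments (`0`/`1` for alleles, `-` for uncovered sites) and the toy inputs are assumptions made for illustration only, and the exact conventions used in the paper's evaluation may differ.

```python
# Illustrative sketch only: names, encodings and toy data are assumptions,
# not part of the PEATH implementation.

def mismatches(fragment, haplotype):
    """Count covered sites where a fragment disagrees with a haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def complement(haplotype):
    """Return the complementary haplotype of a biallelic 0/1 string."""
    return "".join("1" if allele == "0" else "0" for allele in haplotype)


def mec_score(fragments, haplotype):
    """Minimum error correction: each fragment is assigned to whichever of
    the two complementary haplotypes it fits better, and the mismatches
    that remain are summed."""
    other = complement(haplotype)
    return sum(
        min(mismatches(f, haplotype), mismatches(f, other)) for f in fragments
    )


def switch_error_rate(predicted, truth):
    """Fraction of adjacent site pairs where the predicted phase flips
    relative to the true haplotype (one common definition)."""
    agree = [p == t for p, t in zip(predicted, truth)]
    if len(agree) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(agree) - 1)


def n50(block_lengths):
    """Length L such that blocks of length >= L cover at least half of the
    total phased length."""
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    reads = ["01--", "-101", "10-0", "--11"]  # hypothetical fragments
    print(mec_score(reads, "0101"))
    print(switch_error_rate("0110", "0101"))
    print(n50([2, 3, 4, 5, 6]))
```

A lower MEC and switch error rate indicate more accurate phasing, while a larger N50 indicates longer contiguous haplotype blocks.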
Source code of PEATH is available at https://github.com/jcna99/PEATH.
Contact: jkrhee@catholic.ac.kr or sooyong.shin@gmail.com.
Supplementary data are available at Bioinformatics online.