Computational Biology, PacBio, 1305 O'Brien Drive, Menlo Park, CA 94025, United States.
Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae042.
In diploid organisms, phasing is the problem of assigning the alleles at heterozygous variants to one of two haplotypes. Reads from PacBio HiFi sequencing provide long, accurate observations that can be used as the basis for both calling and phasing variants. HiFi reads also excel at calling larger classes of variation, such as structural or tandem repeat variants. However, current phasing tools typically only phase small variants, leaving larger variants unphased.
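To make the problem statement concrete, here is a toy sketch of phasing as a search for the haplotype pair that disagrees least with the reads, a formulation often called minimum error correction. It is not HiPhase's method: it enumerates every candidate exhaustively, which is only feasible for a handful of variants. The read encoding (a dict mapping variant index to observed allele 0 or 1) and the function names are assumptions made for this illustration.

```python
from itertools import product


def mec_cost(hap, reads):
    """Count allele observations that disagree with the best-matching haplotype.

    `hap` is haplotype 1 as a tuple of 0/1 alleles; haplotype 2 is its
    complement, since every site is heterozygous.
    """
    total = 0
    for read in reads:
        mismatches_h1 = sum(1 for pos, allele in read.items() if hap[pos] != allele)
        # At a heterozygous site, a mismatch to haplotype 1 is a match to haplotype 2.
        mismatches_h2 = len(read) - mismatches_h1
        total += min(mismatches_h1, mismatches_h2)
    return total


def phase_brute_force(num_variants, reads):
    """Try every phasing of `num_variants` heterozygous sites; return (hap, cost)."""
    best_hap, best_cost = None, None
    # Fix the first allele to 0: a haplotype and its complement are the same phasing.
    for rest in product((0, 1), repeat=num_variants - 1):
        hap = (0,) + rest
        cost = mec_cost(hap, reads)
        if best_cost is None or cost < best_cost:
            best_hap, best_cost = hap, cost
    return best_hap, best_cost


# Hypothetical reads over three heterozygous variants (indices 0, 1, 2).
reads = [
    {0: 0, 1: 1},
    {1: 1, 2: 0},
    {0: 1, 1: 0},
    {1: 0, 2: 1},
    {0: 0, 2: 1},  # conflicts with the other four reads
]

best_hap, best_cost = phase_brute_force(3, reads)
print(best_hap, best_cost)  # (0, 1, 0) 1
```

The exhaustive loop grows as 2^(n-1) in the number of variants, which is why practical phasers rely on guided search or dynamic programming instead.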
We developed HiPhase, a tool that jointly phases SNVs, indels, structural variants, and tandem repeat variants. The main benefits of HiPhase are (i) dual-mode allele assignment for detecting large variants, (ii) a novel application of the A* algorithm to phasing, and (iii) logic allowing phase blocks to span breaks caused by alignment issues around reference gaps and homozygous deletions. In our assessment, HiPhase produced an average phase block NG50 of 480 kb with 929 switch/flip errors and fully phased 93.8% of genes, improving over the current state of the art. Additionally, HiPhase includes built-in multi-threading, statistics gathering, and concurrent generation of phased alignment output.
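For readers unfamiliar with the NG50 metric quoted above, the following minimal sketch computes it under the usual definition: the length of the block at which the cumulative length of blocks, taken from longest to shortest, first reaches half of the genome length. The function name, the block lengths, and the genome length are hypothetical values chosen for illustration, not figures from the assessment.

```python
def ng50(block_lengths, genome_length):
    """Return the phase block NG50, or 0 if the blocks cover less than half the genome."""
    target = genome_length / 2
    covered = 0
    # Walk blocks from longest to shortest, accumulating covered bases.
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return 0


# Hypothetical phase block lengths (bp) on a hypothetical 1.5 Mb genome.
blocks = [500_000, 300_000, 200_000, 100_000]
print(ng50(blocks, genome_length=1_500_000))  # 300000
```

Unlike N50, which uses the total length of the blocks themselves as the denominator, NG50 uses the genome length, so results stay comparable between tools that phase different fractions of the genome.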
HiPhase is available as source code and a pre-compiled Linux binary with a user guide at https://github.com/PacificBiosciences/HiPhase.