Zhang Zhendong, Liu Yue, Li Xin, Liu Yadong, Wang Yadong, Jiang Tao
Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China.
Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, China.
Front Genet. 2024 Jul 9;15:1435087. doi: 10.3389/fgene.2024.1435087. eCollection 2024.
Structural variants (SVs) are a class of genomic variation that can significantly influence phenotypes and cause disease, so accurate SV detection is a vital part of modern genetic analysis. Long-read sequencing has enabled more accurate and comprehensive SV calling, and many tools have been developed to call SVs from long-read data. Haplotype-tagging is a procedure that labels reads with haplotype information and can therefore potentially improve SV detection; nevertheless, few methods make use of this information. In this article, we introduce HapKled, a new SV detection tool that accurately detects SVs from Oxford Nanopore Technologies (ONT) long-read alignment data. HapKled exploits the haplotype information underlying the alignment data by haplotype-tagging the reads with WhatsHap, and it improves detection performance through three unique calling mechanisms: altering clustering conditions according to the haplotype information of signatures, determining similar SVs based on haplotype information, and relaxing filtering conditions based on haplotype quality. In our evaluations, HapKled outperformed state-of-the-art tools and delivered better SV detection results on both simulated and real sequencing data. The code and experiments of HapKled are available at https://github.com/CoREse/HapKled. Given its SV detection performance, HapKled could be useful in bioinformatics research, clinical diagnosis, and medical research and development.
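The first mechanism named in the abstract, altering clustering conditions according to the haplotype information of signatures, can be illustrated with a minimal sketch. This is not HapKled's implementation: the `Signature` type, the `max_gap` rule, and both distance thresholds are hypothetical and chosen only for illustration. The sketch assumes that two adjacent SV signatures may be merged across a larger gap when both carry the same haplotype tag, and that untagged or differently tagged signatures fall back to a stricter gap.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signature:
    """Hypothetical SV signature extracted from one read alignment."""
    pos: int             # reference start position of the signature
    length: int          # signature length in bp
    hap: Optional[int]   # haplotype tag (1 or 2), or None if the read is untagged


def max_gap(a: Signature, b: Signature,
            base_gap: int = 200, same_hap_gap: int = 500) -> int:
    """Assumed rule: allow a larger gap when both signatures share a haplotype tag."""
    if a.hap is not None and a.hap == b.hap:
        return same_hap_gap
    return base_gap


def cluster_signatures(sigs: List[Signature],
                       base_gap: int = 200,
                       same_hap_gap: int = 500) -> List[List[Signature]]:
    """Greedy single-pass clustering of signatures sorted by position."""
    clusters: List[List[Signature]] = []
    for sig in sorted(sigs, key=lambda s: s.pos):
        if clusters:
            last = clusters[-1][-1]
            # Join the current cluster if the gap to its last member is small enough.
            if sig.pos - last.pos <= max_gap(last, sig, base_gap, same_hap_gap):
                clusters[-1].append(sig)
                continue
        clusters.append([sig])
    return clusters


if __name__ == "__main__":
    example = [
        Signature(1000, 50, 1),
        Signature(1300, 52, 1),
        Signature(1600, 50, 2),
        Signature(3000, 80, None),
    ]
    for cluster in cluster_signatures(example):
        print([(s.pos, s.hap) for s in cluster])
```

In this toy input, the two haplotype-1 signatures 300 bp apart end up in the same cluster, while the same gap between signatures with different tags starts a new cluster. In a real pipeline the tags would come from the haplotype-tagged alignments rather than being set by hand.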