Pindel-TD: A Tandem Duplication Detector Based on A Pattern Growth Approach.

Author information

School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

Center for Mathematical Medical, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an 710061, China.

Publication information

Genomics Proteomics Bioinformatics. 2024 May 9;22(1). doi: 10.1093/gpbjnl/qzae008.

Abstract

Tandem duplication (TD) is a major type of structural variation (SV) that plays an important role in novel gene formation and human diseases. However, most modern SV detection methods often miss TDs or incorrectly classify them as insertions, because they lack specialized handling of TD-related mutational signals. Herein, we developed a TD detection module for the Pindel tool, referred to as Pindel-TD, based on a TD-specific pattern growth approach. Pindel-TD is capable of detecting TDs across a wide size range at single-nucleotide resolution. Using simulated and real read data from HG002, we demonstrated that Pindel-TD outperforms other leading methods in terms of precision, recall, F1-score, and robustness. Furthermore, by applying Pindel-TD to data generated from the K562 cancer cell line, we identified a TD located in the seventh exon of SAGE1, providing an explanation for its high expression. Pindel-TD is available for non-commercial use at https://github.com/xjtu-omics/pindel.
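
To make the distinction between a TD and a generic insertion concrete, the following minimal Python sketch labels an inserted sequence as a tandem duplication when it is an exact copy of the reference sequence immediately adjacent to the insertion point. This is only an illustration of the definition, not Pindel-TD's pattern growth algorithm; the function name `classify_insertion`, its parameters, and the exact-match rule are all assumptions introduced here.

```python
def classify_insertion(reference: str, pos: int, inserted: str) -> str:
    """Label an inserted sequence as a tandem duplication or a plain insertion.

    Hypothetical helper for illustration only (not part of Pindel-TD).
    `pos` is the 0-based reference index before which `inserted` is placed.
    """
    n = len(inserted)
    if n == 0:
        raise ValueError("inserted sequence must not be empty")
    if not 0 <= pos <= len(reference):
        raise ValueError("pos is outside the reference")

    # Reference bases directly before and directly after the insertion point.
    upstream = reference[max(0, pos - n):pos]
    downstream = reference[pos:pos + n]

    # A tandem duplication is an exact copy placed right next to its source.
    if inserted == upstream or inserted == downstream:
        return "tandem_duplication"
    return "insertion"


if __name__ == "__main__":
    ref = "ACGTTGCAGG"
    print(classify_insertion(ref, 7, "TTGC"))  # copy of the bases just upstream
    print(classify_insertion(ref, 7, "AAAA"))  # novel sequence
```

Real callers work from read alignments rather than a known inserted sequence, and must tolerate sequencing errors and inexact copies, so exact string comparison is a deliberate simplification.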


