Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.
Hum Mutat. 2021 Jun;42(6):777-786. doi: 10.1002/humu.24197. Epub 2021 Apr 1.
KATK is a fast and accurate software tool for calling variants directly from raw next-generation sequencing reads. It uses predefined k-mers to retrieve only the reads of interest from the FASTQ file and calls genotypes by aligning the retrieved reads locally. KATK does not use data about known polymorphisms and has NC (no call) as the default genotype. The reference or variant allele is called only if there is sufficient evidence for its presence in the data. Thus, it is not biased against rare variants or de novo mutations. With simulated datasets, we achieved a false-negative rate of 0.23% (sensitivity 99.77%) and a false discovery rate of 0.19%. Calling all human exonic regions with KATK requires 1-2 h, depending on sequencing coverage.
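The two ideas described above, retrieving reads by predefined k-mers and defaulting to NC unless an allele has enough support, can be illustrated with a minimal Python sketch. This is not KATK's implementation: the function names, the `min_support` threshold, and the toy sequences are hypothetical, and exact flank matching stands in for the local alignment step the tool actually performs.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def retrieve_reads(reads, bait_kmers, k):
    """Keep reads sharing at least one k-mer with the predefined set.

    Reads matching on the reverse strand are returned reoriented.
    """
    selected = []
    for read in reads:
        if kmers(read, k) & bait_kmers:
            selected.append(read)
        else:
            rc = reverse_complement(read)
            if kmers(rc, k) & bait_kmers:
                selected.append(rc)
    return selected


def call_genotype(reads, left_flank, right_flank, ref, alt, min_support=3):
    """Call a genotype from allele counts; NC (no call) is the default.

    Assumption: the allele is whatever lies between two exactly matching
    flanks. A real caller would align the reads locally instead.
    """
    counts = Counter()
    for read in reads:
        start = read.find(left_flank)
        if start < 0:
            continue
        start += len(left_flank)
        end = read.find(right_flank, start)
        if end < 0:
            continue
        allele = read[start:end]
        if allele in (ref, alt):
            counts[allele] += 1

    # An allele is reported only if enough reads support it.
    supported = [a for a in (ref, alt) if counts[a] >= min_support]
    if not supported:
        return "NC"
    if len(supported) == 1:
        return supported[0] + "/" + supported[0]
    return ref + "/" + alt


# Toy example with made-up flanks around a single-nucleotide site.
LEFT, RIGHT = "ACGTTGCA", "GGATCCTA"
reads = (
    [LEFT + "A" + RIGHT] * 3
    + [reverse_complement(LEFT + "G" + RIGHT)] * 3
    + ["T" * 17]  # unrelated read, never retrieved
)
selected = retrieve_reads(reads, {LEFT}, k=8)
genotype = call_genotype(selected, LEFT, RIGHT, ref="A", alt="G")
```

Because the caller starts from NC and only upgrades to a genotype when an allele reaches the support threshold, a site with too few reads stays uncalled rather than being reported as homozygous reference.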