Ktrim: an extra-fast and accurate adapter- and quality-trimmer for sequencing data.

Affiliation

Shenzhen Bay Laboratory, Shenzhen 518055, China.

Publication information

Bioinformatics. 2020 Jun 1;36(11):3561-3562. doi: 10.1093/bioinformatics/btaa171.

DOI:10.1093/bioinformatics/btaa171

PMID:32159761

Abstract

MOTIVATION

Next-generation sequencing (NGS) data frequently suffer from poor-quality cycles and adapter contamination and therefore need to be preprocessed before downstream analyses. With the ever-growing throughput and read length of modern sequencers, the preprocessing step has become a bottleneck in data analysis because the performance of current tools is insufficient. Extra-fast and accurate adapter- and quality-trimming tools for sequencing data preprocessing are therefore still urgently needed.

RESULTS

Ktrim was developed in this work. Its key features include built-in support for the adapters of common library preparation kits, support for user-supplied custom adapter sequences, support for both paired-end and single-end data, and parallelization to accelerate the analysis. On the testing datasets, Ktrim was ∼2-18 times faster than current tools and also showed high accuracy. Ktrim could thus serve as a valuable and efficient tool for short-read NGS data preprocessing.
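
For readers unfamiliar with the task, the sketch below illustrates what adapter and quality trimming of a single read involve in general. It is not Ktrim's algorithm, which the abstract does not describe. The function names, thresholds, and the Phred+33 quality encoding are assumptions made for illustration, and the adapter string is only an example of a commonly used Illumina adapter prefix.

```python
# Illustrative sketch only: a naive trimmer, NOT Ktrim's implementation.
# All names and default thresholds here are hypothetical.

def quality_trim(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q.

    Assumes Phred+33 encoded quality strings (an assumption, not from the source).
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def adapter_trim(seq, qual, adapter, max_mismatch_rate=0.1, min_overlap=3):
    """Cut the read at the leftmost position where the remainder of the read
    matches the start of the adapter within the allowed mismatch rate."""
    n = len(seq)
    for start in range(n - min_overlap + 1):
        overlap = min(n - start, len(adapter))
        mismatches = sum(
            1 for a, b in zip(seq[start:start + overlap], adapter) if a != b
        )
        if mismatches <= int(overlap * max_mismatch_rate):
            return seq[:start], qual[:start]
    return seq, qual  # no adapter found; read is returned unchanged


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"  # example adapter prefix, used for illustration
    read = "ACGTACGTAC" + adapter
    quals = "I" * len(read)
    print(adapter_trim(read, quals, adapter))
    print(quality_trim("ACGTACGT", "IIIIII##"))
```

A scan like this costs time proportional to read length times adapter length for every read, which hints at why trimming throughput matters as read counts and read lengths grow.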

AVAILABILITY AND IMPLEMENTATION

Source code and scripts to reproduce the results described in this article are freely available at https://github.com/hellosunking/Ktrim/, distributed under the GPL v3 license.

CONTACT

sunkun@szbl.ac.cn.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
