Del Fabbro Cristian, Scalabrin Simone, Morgante Michele, Giorgi Federico M
Institute of Applied Genomics, Udine, Italy.
IGA Technology Services, Udine, Italy.
PLoS One. 2013 Dec 23;8(12):e85024. doi: 10.1371/journal.pone.0085024. eCollection 2013.
Next Generation Sequencing (NGS) is having an extremely strong impact on biological and medical research and diagnostics, with applications ranging from gene expression quantification to genotyping and genome reconstruction. Sequencing data is often provided as raw reads, which are processed prior to analysis. One of the most used preprocessing procedures is read trimming, which aims at removing low quality portions while preserving the longest high quality part of an NGS read. In the current work, we evaluate nine different trimming algorithms in four datasets and three common NGS-based applications (RNA-Seq, SNP calling and genome assembly). Trimming is shown to increase the quality and reliability of the analysis, with concurrent gains in terms of execution time and computational resources needed.
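The idea of read trimming described above can be illustrated with a minimal sketch. It is not one of the nine algorithms evaluated in the paper; it is only a simple illustration that keeps the longest contiguous stretch of bases whose quality is at or above a cutoff. The function names, the default threshold of 20, and the Phred+33 quality encoding are assumptions made for this example.

```python
def phred33_to_scores(quality_string):
    """Decode a quality string into integer scores (assumes Phred+33 encoding)."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_longest_high_quality(sequence, quality_string, threshold=20):
    """Return the longest contiguous part of a read whose bases all score >= threshold.

    Hypothetical helper for illustration; the threshold of 20 is an assumed default.
    Returns a (sequence, quality_string) pair, both empty if no base passes.
    """
    scores = phred33_to_scores(quality_string)
    if len(scores) != len(sequence):
        raise ValueError("sequence and quality string must have equal length")

    best_start, best_len = 0, 0
    run_start = None  # start index of the current high-quality run, if any
    for i, score in enumerate(scores):
        if score >= threshold:
            if run_start is None:
                run_start = i
            run_len = i - run_start + 1
            # Strict comparison keeps the leftmost run when two runs tie.
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_start = None  # a low-quality base ends the current run

    end = best_start + best_len
    return sequence[best_start:end], quality_string[best_start:end]


if __name__ == "__main__":
    # '#' decodes to quality 2 and 'I' to quality 40 under Phred+33.
    read, qual = trim_longest_high_quality("ACGTACGTAC", "##IIIII#II")
    print(read, qual)
```

In this toy read, the two low-quality bases at the start and the single low-quality base near the end are dropped, and the five-base high-quality stretch in the middle is kept. Real trimmers typically use more tolerant criteria, such as sliding-window averages or running sums, so that a single poor base does not split an otherwise good read.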