PEAR: a fast and accurate Illumina Paired-End reAd mergeR.

Author information

The Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Schloss-Wolfsbrunnenweg 35, D-69118 Heidelberg; Graduate School for Computing in Medicine and Life Sciences, Institut für Neuro- und Bioinformatik, University of Lübeck, 23538 Lübeck; and Karlsruhe Institute of Technology, Institute for Theoretical Informatics, Postfach 6980, 76128 Karlsruhe, Germany.

Publication information

Bioinformatics. 2014 Mar 1;30(5):614-20. doi: 10.1093/bioinformatics/btt593. Epub 2013 Oct 18.

Abstract

MOTIVATION

The Illumina paired-end sequencing technology can generate reads from both ends of target DNA fragments, which can subsequently be merged to increase the overall read length. There already exist tools for merging these paired-end reads when the target fragments are equally long. However, when fragment lengths vary and, in particular, when either the fragment size is shorter than a single-end read, or longer than twice the size of a single-end read, most state-of-the-art mergers fail to generate reliable results. Therefore, a robust tool is needed to merge paired-end reads that exhibit varying overlap lengths because of varying target fragment lengths.

RESULTS

We present the PEAR software for merging raw Illumina paired-end reads from target fragments of varying length. The program evaluates all possible paired-end read overlaps and does not require the target fragment size as input. It also implements a statistical test for minimizing false-positive results. Tests on simulated and empirical data show that PEAR consistently generates highly accurate merged paired-end reads. A highly optimized implementation allows for merging millions of paired-end reads within a few minutes on a standard desktop computer. On multi-core architectures, the parallel version of PEAR shows linear speedups compared with the sequential version of PEAR.
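The idea of evaluating every candidate overlap between the two reads of a pair can be illustrated with a minimal Python sketch. It is not PEAR's implementation: the +1/-1 match/mismatch score, the fixed `min_score` cutoff (a crude stand-in for the statistical test mentioned above), and all function names are assumptions made for illustration. The sketch also covers only fragments longer than a single read, and it ignores base quality scores.

```python
# Illustrative sketch only; scoring scheme, cutoff, and names are assumptions,
# not PEAR's actual algorithm.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def best_overlap(fwd: str, rev_rc: str, min_overlap: int = 10):
    """Try every overlap length k between the end of fwd and the start of
    rev_rc; return (k, score) for the highest-scoring k, or None."""
    best = None
    max_len = min(len(fwd), len(rev_rc))
    for k in range(min_overlap, max_len + 1):
        matches = sum(a == b for a, b in zip(fwd[-k:], rev_rc[:k]))
        score = matches - (k - matches)  # +1 per match, -1 per mismatch
        if best is None or score > best[1]:
            best = (k, score)
    return best


def merge_pair(fwd: str, rev: str, min_overlap: int = 10, min_score: int = 8):
    """Merge a read pair, or return None if no overlap scores well enough.
    Where the reads disagree inside the overlap, the forward base is kept."""
    rev_rc = reverse_complement(rev)
    hit = best_overlap(fwd, rev_rc, min_overlap)
    if hit is None or hit[1] < min_score:
        return None  # treated as "not mergeable" in this sketch
    k, _ = hit
    return fwd + rev_rc[k:]


# Hypothetical 40 bp fragment sequenced with 30 bp reads from both ends,
# so the true overlap is 2 * 30 - 40 = 20 bp.
fragment = "ACGTTGCATGCCGATAGCTAGGCTTAACGGATCCGTATGA"
fwd_read = fragment[:30]
rev_read = reverse_complement(fragment[-30:])
merged = merge_pair(fwd_read, rev_read)
print(merged == fragment)
```

Because every overlap length is scored, the fragment length never has to be supplied; it falls out of the best-scoring overlap. A realistic merger would additionally weight matches by base quality and replace the fixed cutoff with a significance test.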

AVAILABILITY AND IMPLEMENTATION

PEAR is implemented in C and uses POSIX threads. It is freely available at http://www.exelixis-lab.org/web/software/pear.
