Error filtering, pair assembly and error correction for next-generation sequencing reads.

Author information

Tiburon, CA 94920, USA.

Department of Micro- and Nanotechnology, Technical University of Denmark, DK-2800 Lyngby, Denmark.

Publication information

Bioinformatics. 2015 Nov 1;31(21):3476-82. doi: 10.1093/bioinformatics/btv401. Epub 2015 Jul 2.

DOI:10.1093/bioinformatics/btv401

PMID:26139637

Abstract

MOTIVATION

Next-generation sequencing produces vast amounts of data with errors that are difficult to distinguish from true biological variation when coverage is low.

RESULTS

We demonstrate large reductions in error frequencies, especially for high-error-rate reads, by three independent means: (i) filtering reads according to their expected number of errors, (ii) assembling overlapping read pairs and (iii) for amplicon reads, by exploiting unique sequence abundances to perform error correction. We also show that most published paired read assemblers calculate incorrect posterior quality scores.
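Methods (i) and (ii) can be illustrated with a minimal Python sketch. It is not the USEARCH implementation. It assumes Phred+33 quality encoding, the standard relation P = 10^(-Q/10), and, for overlapping bases, a simple model with a uniform prior over the four nucleotides, independent errors in the two reads, and equally likely substitutions. The function names and the `max_ee=1.0` threshold are hypothetical choices made for illustration.

```python
def phred_to_prob(q):
    """Convert a Phred score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def expected_errors(quality_string, offset=33):
    """Expected number of errors in a read: the sum of per-base error probabilities.

    Assumes the quality string is ASCII-encoded with the given offset (Phred+33).
    """
    return sum(phred_to_prob(ord(c) - offset) for c in quality_string)


def passes_filter(quality_string, max_ee=1.0, offset=33):
    """Keep a read only if its expected error count is at most max_ee.

    max_ee=1.0 is an illustrative threshold, not a value taken from the abstract.
    """
    return expected_errors(quality_string, offset) <= max_ee


def posterior_error_agree(p1, p2):
    """Posterior error probability when the two overlapping bases agree.

    Follows from Bayes' rule under the stated model (uniform prior,
    independent errors, equally likely substitutions).
    """
    return (p1 * p2 / 3) / (1 - p1 - p2 + 4 * p1 * p2 / 3)


def posterior_error_disagree(p_kept, p_other):
    """Posterior error probability when the bases disagree.

    p_kept is the error probability of the base that is retained (normally
    the one with the higher quality); p_other belongs to the discarded base.
    Undefined when both probabilities are zero.
    """
    return p_kept * (1 - p_other / 3) / (p_kept + p_other - 4 * p_kept * p_other / 3)
```

Under this model, agreement between two Q10 bases (P = 0.1 each) gives a posterior error probability of about 0.0041, lower than either input, while a disagreement leaves the retained base less reliable than its original quality score suggests.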

AVAILABILITY AND IMPLEMENTATION

These methods are implemented in the USEARCH package. Binaries are freely available at http://drive5.com/usearch.

CONTACT

robert@drive5.com

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

Error filtering, pair assembly and error correction for next-generation sequencing reads.

Bioinformatics. 2015 Nov 1;31(21):3476-82. doi: 10.1093/bioinformatics/btv401. Epub 2015 Jul 2.

QuorUM: An Error Corrector for Illumina Reads.

PLoS One. 2015 Jun 17;10(6):e0130821. doi: 10.1371/journal.pone.0130821. eCollection 2015.

BLESS: bloom filter-based error correction solution for high-throughput sequencing reads.

Bioinformatics. 2014 May 15;30(10):1354-62. doi: 10.1093/bioinformatics/btu030. Epub 2014 Jan 21.

RECKONER: read error corrector based on KMC.

Bioinformatics. 2017 Apr 1;33(7):1086-1089. doi: 10.1093/bioinformatics/btw746.

Trowel: a fast and accurate error correction module for Illumina sequencing reads.

Bioinformatics. 2014 Nov 15;30(22):3264-5. doi: 10.1093/bioinformatics/btu513. Epub 2014 Jul 29.

Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm.

Bioinformatics. 2020 Jun 1;36(12):3669-3679. doi: 10.1093/bioinformatics/btaa179.

RepLong: de novo repeat identification using long read sequencing data.

Bioinformatics. 2018 Apr 1;34(7):1099-1107. doi: 10.1093/bioinformatics/btx717.

CoLoRMap: Correcting Long Reads by Mapping short reads.

Bioinformatics. 2016 Sep 1;32(17):i545-i551. doi: 10.1093/bioinformatics/btw463.

An efficient error correction algorithm using FM-index.

BMC Bioinformatics. 2017 Nov 28;18(1):524. doi: 10.1186/s12859-017-1940-1.

Assembling short reads from jumping libraries with large insert sizes.

Bioinformatics. 2015 Oct 15;31(20):3262-8. doi: 10.1093/bioinformatics/btv337. Epub 2015 Jun 3.

Cited by

Host Shaping Associated Microbiota in Hydrothermal Vent Snails from the Indian Ocean Ridge.

Biology (Basel). 2025 Jul 29;14(8):954. doi: 10.3390/biology14080954.

Totum-448 Improves MASLD and Modulates Microbiota in Hamsters: Dose-Response Study and Effects of Supplementation Cessation.

Food Sci Nutr. 2025 Sep 2;13(9):e70904. doi: 10.1002/fsn3.70904. eCollection 2025 Sep.

Bacterial microbiome and their assembly processing in two sympatric desert rodents ( and ) from different geographic sources.

Curr Zool. 2024 Oct 4;71(4):440-448. doi: 10.1093/cz/zoae062. eCollection 2025 Aug.

Archived natural DNA samplers reveal four decades of biodiversity change across the tree of life.

Nat Ecol Evol. 2025 Aug 1. doi: 10.1038/s41559-025-02812-6.

Metagenomic Insight into Cecal Microbiota Shifts in Broiler Chicks Following spp. Vaccination.

Microorganisms. 2025 Jun 24;13(7):1470. doi: 10.3390/microorganisms13071470.

Spatio-Temporal Variation in Diet Among Age and Sex Cohorts of a Model Generalist Bird Species, the Great Tit : New Insights Revealed by DNA Metabarcoding.

Ecol Evol. 2025 Jul 14;15(7):e71565. doi: 10.1002/ece3.71565. eCollection 2025 Jul.

Biomonitoring 2.0 Refined: observing local change through metaphylogeography using a community-based eDNA metabarcoding monitoring network.

BMC Biol. 2025 Jul 1;23(1):187. doi: 10.1186/s12915-025-02284-x.

Microbial dynamics across tri-trophic systems: insights from plant-herbivore-predator interactions.

FEMS Microbiol Ecol. 2025 Jun 24;101(7). doi: 10.1093/femsec/fiaf065.

Population Persistence and Soil Microbial Communities of a Serpentine Endemic Plant Outside Its Historic Elevation Range.

Ecol Evol. 2025 Jun 21;15(6):e71629. doi: 10.1002/ece3.71629. eCollection 2025 Jun.

A Comparison of the Effects of Milk, Yogurt, and Cheese on Insulin Sensitivity, Hepatic Steatosis, and Gut Microbiota in Diet-Induced Obese Male Mice.

Int J Mol Sci. 2025 May 23;26(11):5026. doi: 10.3390/ijms26115026.
