CASPER: context-aware scheme for paired-end reads from high-throughput amplicon sequencing.

Publication information

BMC Bioinformatics. 2014;15 Suppl 9(Suppl 9):S10. doi: 10.1186/1471-2105-15-S9-S10. Epub 2014 Sep 10.

DOI:10.1186/1471-2105-15-S9-S10
PMID:25252785
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4168710/
Abstract

Merging the forward and reverse reads from paired-end sequencing is a critical task that can significantly improve the performance of downstream tasks, such as genome assembly and mapping, by providing them with virtually elongated reads. However, due to the inherent limitations of most paired-end sequencers, the chance of observing erroneous bases grows rapidly as the end of a read is approached, which becomes a critical hurdle for accurately merging paired-end reads. Although there exist several sophisticated approaches to this problem, their performance in terms of quality of merging often remains unsatisfactory. To address this issue, here we present a context-aware scheme for paired-end reads (CASPER): a computational method to rapidly and robustly merge overlapping paired-end reads. Being particularly well suited to amplicon sequencing applications, CASPER is thoroughly tested with both simulated and real high-throughput amplicon sequencing data. According to our experimental results, CASPER significantly outperforms existing state-of-the-art paired-end merging tools in terms of accuracy and robustness. CASPER also exploits the parallelism in the task of paired-end merging and effectively speeds up by multithreading. CASPER is freely available for academic use at http://best.snu.ac.kr/casper.
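
To make the merging task in the abstract concrete, the following is a minimal sketch of generic overlap-based merging. It is not CASPER's algorithm: the abstract does not describe how the context-aware step works, so this sketch only resolves mismatches in the overlap by taking the base with the higher quality score. The function names, the `min_overlap` and `max_mismatch_ratio` parameters, and the toy amplicon are all assumptions made for illustration.

```python
# Hypothetical illustration of overlap-based paired-end merging (not CASPER itself).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(fwd, fwd_qual, rev, rev_qual, min_overlap=10, max_mismatch_ratio=0.1):
    """Merge a forward and a reverse read; return the merged string or None.

    Qualities are lists of integer Phred scores, one per base.
    """
    rc = reverse_complement(rev)   # bring the reverse read onto the forward strand
    rc_qual = rev_qual[::-1]       # qualities must be reversed along with the bases
    best = None                    # (mismatch_ratio, overlap_length)
    # Try every overlap length, longest first; keep the lowest mismatch ratio.
    for olen in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        mismatches = sum(a != b for a, b in zip(fwd[-olen:], rc[:olen]))
        ratio = mismatches / olen
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, olen)
    if best is None:
        return None                # no acceptable overlap found
    olen = best[1]
    start = len(fwd) - olen
    overlap = []
    for i in range(olen):
        # On a mismatch, trust the base with the higher quality score.
        if fwd[start + i] == rc[i] or fwd_qual[start + i] >= rc_qual[i]:
            overlap.append(fwd[start + i])
        else:
            overlap.append(rc[i])
    return fwd[:start] + "".join(overlap) + rc[olen:]


# Toy 30 bp amplicon sequenced as two 20 bp reads that overlap by 10 bp.
amplicon = "ACGTTGCAAGGCTTAGCCATGACGTACGAT"
fwd = amplicon[:20]
rev = reverse_complement(amplicon[10:])
merged = merge_pair(fwd, [30] * 20, rev, [30] * 20)

# Same pair with a low-quality sequencing error near the 3' end of the forward read.
fwd_err = fwd[:18] + "C" + fwd[19:]
fwd_err_qual = [30] * 18 + [5, 30]
merged_err = merge_pair(fwd_err, fwd_err_qual, rev, [30] * 20)
```

In this toy case both calls reconstruct the full amplicon, because the erroneous base falls inside the overlap and the reverse read covers it with a higher quality score. Each read pair is processed independently of the others, which is the property that makes the task amenable to the multithreading mentioned in the abstract.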

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/703b/4168710/1ef6cd7ae8fa/1471-2105-15-S9-S10-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/703b/4168710/9af66dd319a0/1471-2105-15-S9-S10-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/703b/4168710/539e4b6ac821/1471-2105-15-S9-S10-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/703b/4168710/cfbf7b0c62ed/1471-2105-15-S9-S10-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/703b/4168710/c9d4b5ab0399/1471-2105-15-S9-S10-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/703b/4168710/b73efb1543c8/1471-2105-15-S9-S10-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/703b/4168710/ef7334a0d42d/1471-2105-15-S9-S10-7.jpg

Similar articles

1
CASPER: context-aware scheme for paired-end reads from high-throughput amplicon sequencing.
BMC Bioinformatics. 2014;15 Suppl 9(Suppl 9):S10. doi: 10.1186/1471-2105-15-S9-S10. Epub 2014 Sep 10.
2
MeFiT: merging and filtering tool for illumina paired-end reads for 16S rRNA amplicon sequencing.
BMC Bioinformatics. 2016 Dec 1;17(1):491. doi: 10.1186/s12859-016-1358-1.
3
NGmerge: merging paired-end reads via novel empirically-derived models of sequencing errors.
BMC Bioinformatics. 2018 Dec 20;19(1):536. doi: 10.1186/s12859-018-2579-2.
4
PEAR: a fast and accurate Illumina Paired-End reAd mergeR.
Bioinformatics. 2014 Mar 1;30(5):614-20. doi: 10.1093/bioinformatics/btt593. Epub 2013 Oct 18.
5
BBMerge - Accurate paired shotgun read merging via overlap.
PLoS One. 2017 Oct 26;12(10):e0185056. doi: 10.1371/journal.pone.0185056. eCollection 2017.
6
CAREx: context-aware read extension of paired-end sequencing data.
BMC Bioinformatics. 2024 May 10;25(1):186. doi: 10.1186/s12859-024-05802-w.
7
Subset selection of high-depth next generation sequencing reads for de novo genome assembly using MapReduce framework.
BMC Genomics. 2015;16 Suppl 12(Suppl 12):S9. doi: 10.1186/1471-2164-16-S12-S9. Epub 2015 Dec 9.
8
Benefits of merging paired-end reads before pre-processing environmental metagenomics data.
Mar Genomics. 2022 Feb;61:100914. doi: 10.1016/j.margen.2021.100914. Epub 2021 Dec 2.
9
Pseudo-Sanger sequencing: massively parallel production of long and near error-free reads using NGS technology.
BMC Genomics. 2013 Oct 17;14(1):711. doi: 10.1186/1471-2164-14-711.
10
Don't let valuable microbiome data go to waste: combined usage of merging and direct-joining of sequencing reads for low-quality paired-end amplicon data.
Biotechnol Lett. 2024 Oct;46(5):791-805. doi: 10.1007/s10529-024-03509-9. Epub 2024 Jul 6.

Cited by

1
Comparison of extracellular vesicles carrying bacterial DNA in urine and serum from a Korean population.
Front Microbiol. 2025 Aug 12;16:1616528. doi: 10.3389/fmicb.2025.1616528. eCollection 2025.
2
Age- and Sex-Specific Gut Microbiota Signatures Associated with Dementia-Related Brain Pathologies: An LEfSe-Based Metagenomic Study.
Brain Sci. 2025 Jun 5;15(6):611. doi: 10.3390/brainsci15060611.
3
Refining microbiome diversity analysis by concatenating and integrating dual 16S rRNA amplicon reads.
NPJ Biofilms Microbiomes. 2025 Apr 12;11(1):57. doi: 10.1038/s41522-025-00686-x.
4
Robust phylogenetic tree-based microbiome association test using repeatedly measured data for composition bias.
BMC Bioinformatics. 2025 Mar 6;26(1):75. doi: 10.1186/s12859-024-06002-2.
5
Microbial Communities in Agave Fermentations Vary by Local Biogeographic Regions.
Environ Microbiol Rep. 2025 Feb;17(1):e70057. doi: 10.1111/1758-2229.70057.
6
Revealing the microbiome diversity and biocontrol potential of field Aedes ssp.: Implications for disease vector management.
PLoS One. 2024 Apr 29;19(4):e0302328. doi: 10.1371/journal.pone.0302328. eCollection 2024.
7
Efficacy of Quadruple-coated Probiotics in Patients With Irritable Bowel Syndrome: A Randomized, Double-blind, Placebo-controlled, Parallel-group Study.
J Neurogastroenterol Motil. 2024 Jan 30;30(1):73-86. doi: 10.5056/jnm23036.
8
A study of microbial diversity in a biofertilizer consortium.
PLoS One. 2023 Aug 24;18(8):e0286285. doi: 10.1371/journal.pone.0286285. eCollection 2023.
9
Organic and inorganic nutrients modulate taxonomic diversity and trophic strategies of small eukaryotes in oligotrophic oceans.
FEMS Microbes. 2022 Dec 7;4:xtac029. doi: 10.1093/femsmc/xtac029. eCollection 2023.
10
Potential of Gut Microbe-Derived Extracellular Vesicles to Differentiate Inflammatory Bowel Disease Patients from Healthy Controls.
Gut Liver. 2023 Jan 15;17(1):108-118. doi: 10.5009/gnl220081. Epub 2022 Nov 25.