
Fulcrum: condensing redundant reads from high-throughput sequencing studies.

Affiliation

Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5120, USA.

Publication information

Bioinformatics. 2012 May 15;28(10):1324-7. doi: 10.1093/bioinformatics/bts123. Epub 2012 Mar 13.

DOI: 10.1093/bioinformatics/bts123
PMID: 22419786
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3348557/

Abstract

MOTIVATION

Ultra-high-throughput sequencing produces duplicate and near-duplicate reads, which can consume computational resources in downstream applications. A tool that collapses such reads should reduce storage and assembly complications and costs.

RESULTS

We developed Fulcrum to collapse identical and near-identical Illumina and 454 reads (such as those from PCR clones) into single error-corrected sequences; it can process paired-end as well as single-end reads. Fulcrum is customizable and can be deployed on a single machine, a local network or a commercially available MapReduce cluster, and it has been optimized to maximize ease-of-use, cross-platform compatibility and future scalability. Sequence datasets have been collapsed by up to 71%, and the reduced number and improved quality of the resulting sequences allow assemblers to produce longer contigs while using less memory.
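To make the idea of collapsing concrete, here is a minimal Python sketch. It is not Fulcrum's actual algorithm, which the abstract does not describe. It assumes single-end reads given as plain strings, bins them by length and a fixed-length prefix (loosely analogous to a MapReduce "map" key), then greedily clusters reads within a small Hamming distance of a seed read and emits a per-position majority-vote consensus. The names `collapse_reads`, `prefix_len`, and `max_mismatches` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def consensus(reads):
    """Build a consensus by per-position majority vote over equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


def collapse_reads(reads, prefix_len=8, max_mismatches=1):
    """Collapse identical and near-identical reads (illustrative sketch only).

    Returns a list of (consensus_sequence, number_of_reads_collapsed) tuples.
    """
    # "Map"-like step: bin by (length, prefix) so that only plausible
    # duplicates are ever compared with each other.
    bins = defaultdict(list)
    for read in reads:
        bins[(len(read), read[:prefix_len])].append(read)

    collapsed = []
    # "Reduce"-like step: within each bin, attach every read to the first
    # cluster whose seed read is within max_mismatches, else start a new one.
    for bin_reads in bins.values():
        clusters = []
        for read in bin_reads:
            for cluster in clusters:
                if hamming(read, cluster[0]) <= max_mismatches:
                    cluster.append(read)
                    break
            else:
                clusters.append([read])
        collapsed.extend((consensus(c), len(c)) for c in clusters)
    return collapsed


if __name__ == "__main__":
    example = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "TTTTGGGGCC"]
    for sequence, count in collapse_reads(example):
        print(sequence, count)
```

One limitation of this simplification is that reads whose sequencing error falls inside the prefix land in different bins and are never merged. It also ignores base quality scores and paired-end structure, both of which a production tool would need to handle.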
