Lighter: fast and memory-efficient sequencing error correction without counting.

Authors

Song Li, Florea Liliana, Langmead Ben

Publication information

Genome Biol. 2014;15(11):509. doi: 10.1186/s13059-014-0509-9.

DOI:10.1186/s13059-014-0509-9
PMID:25398208
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4248469/

Abstract

Lighter is a fast, memory-efficient tool for correcting sequencing errors. Lighter avoids counting k-mers. Instead, it uses a pair of Bloom filters, one holding a sample of the input k-mers and the other holding k-mers likely to be correct. As long as the sampling fraction is adjusted in inverse proportion to the depth of sequencing, Bloom filter size can be held constant while maintaining near-constant accuracy. Lighter is parallelized, uses no secondary storage, and is both faster and more memory-efficient than competing approaches while achieving comparable accuracy.
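The two ideas in the abstract, a pair of Bloom filters and a sampling fraction that shrinks as sequencing depth grows, can be illustrated with a minimal Python sketch. This is not Lighter's implementation. The names `BloomFilter`, `sampling_fraction`, `sample_kmers`, and `promote_trusted` are hypothetical, the SHA-256 double-hashing scheme is chosen only for brevity, and the promotion rule (a k-mer is copied to the second filter if it is found in the sample filter) is a simplified stand-in for whatever criterion the paper uses to decide that a k-mer is likely correct.

```python
import hashlib
import random


class BloomFilter:
    """Fixed-size Bloom filter; probe positions come from double hashing."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(read: str, k: int):
    """Yield every length-k substring of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


def sampling_fraction(base_alpha: float, base_depth: float, depth: float) -> float:
    """Scale the sampling fraction in inverse proportion to depth (capped at 1)."""
    return min(1.0, base_alpha * base_depth / depth)


def sample_kmers(reads, k: int, alpha: float, sample: BloomFilter,
                 rng: random.Random) -> int:
    """Insert each k-mer into the sample filter with probability alpha."""
    inserted = 0
    for read in reads:
        for kmer in kmers(read, k):
            if rng.random() < alpha:
                sample.add(kmer)
                inserted += 1
    return inserted


def promote_trusted(reads, k: int, sample: BloomFilter, trusted: BloomFilter) -> None:
    """Assumed stand-in rule: keep a k-mer if the sample filter reports it."""
    for read in reads:
        for kmer in kmers(read, k):
            if kmer in sample:
                trusted.add(kmer)


if __name__ == "__main__":
    rng = random.Random(0)
    reads = ["ACGTACGTACGT", "ACGTACGTACGA", "ACGTACGTACGT"]
    alpha = sampling_fraction(base_alpha=0.5, base_depth=3.0, depth=3.0)
    sample = BloomFilter(num_bits=1 << 12, num_hashes=3)
    trusted = BloomFilter(num_bits=1 << 12, num_hashes=3)
    print("sampled:", sample_kmers(reads, 5, alpha, sample, rng))
    promote_trusted(reads, 5, sample, trusted)
    print("ACGTA trusted:", "ACGTA" in trusted)
```

Because the number of input k-mers grows roughly linearly with depth, keeping the product of sampling fraction and depth fixed keeps the expected number of sampled k-mers roughly fixed as well, which is why the filter size in this sketch does not need to change when depth doubles and `alpha` is halved.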

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ba/4248469/359d102d4595/13059_2014_509_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ba/4248469/6ea820b0ad35/13059_2014_509_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ba/4248469/247b34d5b4d1/13059_2014_509_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ba/4248469/635d155ac256/13059_2014_509_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ba/4248469/de1e671c21ee/13059_2014_509_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ba/4248469/a540487dbaf7/13059_2014_509_Fig6_HTML.jpg

