Sigmoni: classification of nanopore signal with a compressed pangenome index.

Authors

Shivakumar Vikram S, Ahmed Omar Y, Kovaka Sam, Zakeri Mohsen, Langmead Ben

Affiliation

Department of Computer Science, Johns Hopkins University.

Publication

bioRxiv. 2023 Aug 30:2023.08.15.553308. doi: 10.1101/2023.08.15.553308.

DOI: 10.1101/2023.08.15.553308
PMID: 37645873
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10462034/

Abstract

Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling, but past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index that scales to references of hundreds of Gbps. Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics. Sigmoni is 10-100× faster than previous methods for adaptive sampling in host depletion experiments with improved accuracy, and can query reads against large microbial or human pangenomes.
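
The pipeline the abstract outlines (quantize the signal into a small alphabet, compute matching statistics against each reference, then classify) can be illustrated with a toy sketch. This is not the authors' implementation. The bin count, the normalized signal range, the function names, and the mean-based decision rule are all assumptions made for illustration. A naive quadratic scan stands in for the compressed index, and the co-linearity statistics are omitted.

```python
from statistics import mean

def quantize(signal, n_bins=6, lo=-2.0, hi=2.0):
    """Map each (assumed normalized) current value to a symbol 0..n_bins-1.

    Equal-width bins over [lo, hi]; values outside the range are clamped.
    """
    width = (hi - lo) / n_bins
    symbols = []
    for x in signal:
        b = int((x - lo) // width)
        symbols.append(min(max(b, 0), n_bins - 1))
    return symbols

def matching_statistics(query, ref):
    """MS[i] = length of the longest prefix of query[i:] occurring in ref.

    Naive scan for clarity only; a compressed index would replace this step.
    """
    ms = []
    for i in range(len(query)):
        best = 0
        for j in range(len(ref)):
            k = 0
            while (i + k < len(query) and j + k < len(ref)
                   and query[i + k] == ref[j + k]):
                k += 1
            best = max(best, k)
        ms.append(best)
    return ms

def classify(query, references):
    """Pick the reference class with the highest mean matching statistic.

    A simplified stand-in for comparing matching-statistic distributions.
    """
    scores = {name: mean(matching_statistics(query, ref))
              for name, ref in references.items()}
    return max(scores, key=scores.get), scores
```

In this sketch, a read whose quantized signal shares long exact runs with one reference gets a higher mean matching statistic for that class and is assigned to it.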
