
Syncmers are more sensitive than minimizers for selecting conserved k-mers in biological sequences.

Author information

Robert Edgar

Affiliation

None, Corte Madera, CA, USA.

Publication information

PeerJ. 2021 Feb 5;9:e10805. doi: 10.7717/peerj.10805. eCollection 2021.

DOI: 10.7717/peerj.10805
PMID: 33604186
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7869670/
Abstract

Minimizers are widely used to select subsets of fixed-length substrings (k-mers) from biological sequences in applications ranging from read mapping to taxonomy prediction and indexing of large datasets. The minimizer of a string of w consecutive k-mers is the k-mer with the smallest value according to an ordering of all k-mers. Syncmers are defined here as a family of alternative methods which select k-mers by inspecting the position of the smallest-valued substring of length s < k within the k-mer. For example, a closed syncmer is selected if its smallest s-mer is at the start or end of the k-mer. At least one closed syncmer must be found in every window of length (k - s) k-mers. Unlike a minimizer, a syncmer is identified by its sequence alone, and is therefore synchronized in the following sense: if a given k-mer is selected from one sequence, it will also be selected from any other sequence. Also, minimizers can be deleted by mutations in flanking sequence, which cannot happen with syncmers. Experiments on minimizers with parameters used in the minimap2 read mapper and the Kraken taxonomy prediction algorithm, respectively, show that syncmers can simultaneously achieve both lower density and higher conservation compared to minimizers.
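The selection rules described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, substrings are ordered lexicographically (an assumption made for readability; practical tools typically order by a hash value), ties are broken by taking the leftmost smallest substring, and `k`, `s`, and `w` denote the k-mer length, the inner substring length, and the number of k-mers per minimizer window.

```python
def smallest_smer_offset(kmer, s):
    """Offset of the smallest s-mer inside kmer (leftmost on ties).

    Assumption: plain lexicographic order stands in for the ordering.
    """
    smers = [kmer[i:i + s] for i in range(len(kmer) - s + 1)]
    return min(range(len(smers)), key=lambda i: smers[i])


def is_closed_syncmer(kmer, s):
    """A k-mer is a closed syncmer if its smallest s-mer is at the start or end."""
    offset = smallest_smer_offset(kmer, s)
    return offset == 0 or offset == len(kmer) - s


def closed_syncmers(seq, k, s):
    """(position, k-mer) pairs for every closed syncmer in seq."""
    return [
        (i, seq[i:i + k])
        for i in range(len(seq) - k + 1)
        if is_closed_syncmer(seq[i:i + k], s)
    ]


def minimizers(seq, k, w):
    """(position, k-mer) pairs that are the smallest k-mer in at least one
    window of w consecutive k-mers."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        selected.add(min(range(start, start + w), key=lambda i: kmers[i]))
    return sorted((i, kmers[i]) for i in selected)


# The k-mer CCA appears in both toy sequences with different flanking bases.
print(closed_syncmers("CCAT", 3, 1))  # [(0, 'CCA')]
print(closed_syncmers("TCCA", 3, 1))  # [(1, 'CCA')]
print(minimizers("CCAT", 3, 2))       # [(1, 'CAT')]
print(minimizers("TCCA", 3, 2))       # [(1, 'CCA')]
```

In this toy case `CCA` is a closed syncmer in both sequences because the decision depends only on the k-mer itself, whereas it is a minimizer only in `TCCA`, where its neighbor happens to be larger. That is the flanking-sequence effect the abstract describes.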


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c15/7869670/ba60c794df17/peerj-09-10805-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c15/7869670/c398c577328c/peerj-09-10805-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c15/7869670/d092cecb6fc7/peerj-09-10805-g002.jpg

Similar articles

1
Syncmers are more sensitive than minimizers for selecting conserved k-mers in biological sequences.
PeerJ. 2021 Feb 5;9:e10805. doi: 10.7717/peerj.10805. eCollection 2021.
2
Density and Conservation Optimization of the Generalized Masked-Minimizer Sketching Scheme.
J Comput Biol. 2024 Jan;31(1):2-20. doi: 10.1089/cmb.2023.0212. Epub 2023 Nov 17.
3
Theory of local k-mer selection with applications to long-read alignment.
Bioinformatics. 2022 Oct 14;38(20):4659-4669. doi: 10.1093/bioinformatics/btab790.
4
Improved design and analysis of practical minimizers.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i119-i127. doi: 10.1093/bioinformatics/btaa472.
5
Efficient minimizer orders for large values of k using minimum decycling sets.
Genome Res. 2023 Jul;33(7):1154-1161. doi: 10.1101/gr.277644.123. Epub 2023 Aug 9.
6
Weighted minimizer sampling improves long read mapping.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i111-i118. doi: 10.1093/bioinformatics/btaa435.
7
Parameterized syncmer schemes improve long-read mapping.
PLoS Comput Biol. 2022 Oct 28;18(10):e1010638. doi: 10.1371/journal.pcbi.1010638. eCollection 2022 Oct.
8
Differentiable Learning of Sequence-Specific Minimizer Schemes with DeepMinimizer.
J Comput Biol. 2022 Dec;29(12):1288-1304. doi: 10.1089/cmb.2022.0275. Epub 2022 Sep 12.
9
Improving the performance of minimizers and winnowing schemes.
Bioinformatics. 2017 Jul 15;33(14):i110-i117. doi: 10.1093/bioinformatics/btx235.
10
On Minimizers and Convolutional Filters: Theoretical Connections and Applications to Genome Analysis.
J Comput Biol. 2024 May;31(5):381-395. doi: 10.1089/cmb.2024.0483. Epub 2024 Apr 30.

Articles citing this article

1
Deep learning neural network development for the classification of bacteriocin sequences produced by lactic acid bacteria.
F1000Res. 2025 Jun 20;13:981. doi: 10.12688/f1000research.154432.2. eCollection 2024.
2
Oatk: a de novo assembly tool for complex plant organelle genomes.
Genome Biol. 2025 Aug 7;26(1):235. doi: 10.1186/s13059-025-03676-6.
3
Efficient seeding for error-prone sequences with SubseqHash2.
Bioinformatics. 2025 Aug 2;41(8). doi: 10.1093/bioinformatics/btaf418.
4
ASVBM: Structural variant benchmarking with local joint analysis for multiple callsets.
Comput Struct Biotechnol J. 2025 Jun 29;27:2851-2862. doi: 10.1016/j.csbj.2025.06.045. eCollection 2025.
5
GreedyMini: generating low-density DNA minimizers.
Bioinformatics. 2025 Jul 1;41(Supplement_1):i275-i284. doi: 10.1093/bioinformatics/btaf251.
6
Fast and flexible minimizer digestion with digest.
Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf368.
7
A survey of sequence-to-graph mapping algorithms in the pangenome era.
Genome Biol. 2025 May 22;26(1):138. doi: 10.1186/s13059-025-03606-6.
8
Verkko2 integrates proximity-ligation data with long-read De Bruijn graphs for efficient telomere-to-telomere genome assembly, phasing, and scaffolding.
Genome Res. 2025 Jun 12. doi: 10.1101/gr.280383.124.
9
De novo clustering of large long-read transcriptome datasets with isONclust3.
Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf207.
10
The open-closed mod-minimizer algorithm.
Algorithms Mol Biol. 2025 Mar 17;20(1):4. doi: 10.1186/s13015-025-00270-0.

References cited by this article

1
A Randomized Parallel Algorithm for Efficiently Finding Near-Optimal Universal Hitting Sets.
Res Comput Mol Biol. 2020 May;12074:37-53. doi: 10.1007/978-3-030-45257-5_3. Epub 2020 Apr 21.
2
Improved design and analysis of practical minimizers.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i119-i127. doi: 10.1093/bioinformatics/btaa472.
3
Weighted minimizer sampling improves long read mapping.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i111-i118. doi: 10.1093/bioinformatics/btaa435.
4
Improved metagenomic analysis with Kraken 2.
Genome Biol. 2019 Nov 28;20(1):257. doi: 10.1186/s13059-019-1891-0.
5
Asymptotically optimal minimizers schemes.
Bioinformatics. 2018 Jul 1;34(13):i13-i22. doi: 10.1093/bioinformatics/bty258.
6
Minimap2: pairwise alignment for nucleotide sequences.
Bioinformatics. 2018 Sep 15;34(18):3094-3100. doi: 10.1093/bioinformatics/bty191.
7
Designing small universal k-mer hitting sets for improved analysis of high-throughput sequencing.
PLoS Comput Biol. 2017 Oct 2;13(10):e1005777. doi: 10.1371/journal.pcbi.1005777. eCollection 2017 Oct.
8
Next-generation sequencing: big data meets high performance computing.
Drug Discov Today. 2017 Apr;22(4):712-717. doi: 10.1016/j.drudis.2017.01.014. Epub 2017 Feb 2.
9
Kraken: ultrafast metagenomic sequence classification using exact alignments.
Genome Biol. 2014 Mar 3;15(3):R46. doi: 10.1186/gb-2014-15-3-r46.
10
Exploiting sparseness in de novo genome assembly.
BMC Bioinformatics. 2012 Apr 19;13 Suppl 6(Suppl 6):S1. doi: 10.1186/1471-2105-13-S6-S1.