SeqKit2: A Swiss army knife for sequence and alignment processing.

Authors

Shen Wei, Sipos Botond, Zhao Liuyang

Affiliations

Department of Infectious Diseases, Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China.

European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridgeshire, UK.

Publication information

Imeta. 2024 Apr 5;3(3):e191. doi: 10.1002/imt2.191. eCollection 2024 Jun.

DOI: 10.1002/imt2.191
PMID: 38898985
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11183193/
Abstract

In the era of ubiquitous high-throughput sequencing studies, there is a growing need for analysis tools that are not just performant but also comprehensive and user-friendly enough to cater to both novice and advanced users. This article introduces SeqKit2, the next iteration of the widely used sequence analysis tool SeqKit, featuring expanded functionality, performance optimizations, and support for additional compression methods. Retaining a pragmatic subcommand architecture, SeqKit2 represents a substantial enhancement through the inclusion of 19 additional subcommands, expanding its overall repertoire to a total of 38 in eight categories. The new subcommands add functionality such as amplicon processing and robust, error-tolerant parsing of sequence records. In addition, three subcommands designed for real-time analysis are added for periodic monitoring of properties of FASTQ and Binary Alignment/Map alignment records and real-time streaming from multiple sequence files. The performance of SeqKit2 is benchmarked against the old version of SeqKit, Bioawk, Seqtk, and SeqFu tools. SeqKit2 consistently outperforms its predecessor, albeit with marginally higher memory usage, while maintaining competitive runtimes against other tools. With its broad functionality, proven usability, and ongoing development driven by user feedback, we hope that bioinformaticians will find SeqKit2 useful as a "Swiss army knife" of sequence and alignment processing, equally adept at facilitating ad hoc analyses and seamlessly integrating into larger pipelines.
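
To make the idea of "robust, error-tolerant parsing of sequence records" concrete, here is a minimal Python sketch of one way such a parser might behave. It is an illustration only and is not SeqKit2's implementation or algorithm: the names `FastqRecord` and `parse_fastq_tolerant` are hypothetical, the input is assumed to be plain four-line FASTQ, and the whole input is buffered in memory for simplicity.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    """One FASTQ record (hypothetical container for this sketch)."""
    name: str
    seq: str
    qual: str


def parse_fastq_tolerant(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield well-formed four-line FASTQ records and skip malformed ones.

    Illustrative helper only; not how SeqKit2 itself parses records.
    """
    # Buffer the input so we can look ahead and resynchronise after errors.
    buf = [line.rstrip("\r\n") for line in lines]
    i = 0
    while i < len(buf):
        # A record must start with an '@' header line; skip anything else.
        if not buf[i].startswith("@"):
            i += 1
            continue
        chunk = buf[i:i + 4]
        well_formed = (
            len(chunk) == 4
            and chunk[2].startswith("+")          # separator line
            and len(chunk[1]) > 0                 # non-empty sequence
            and len(chunk[1]) == len(chunk[3])    # seq/qual lengths agree
        )
        if well_formed:
            yield FastqRecord(chunk[0][1:], chunk[1], chunk[3])
            i += 4
        else:
            # Drop only the suspect header and retry from the next line.
            i += 1
```

For example, given three records where the second has a sequence shorter than its quality string, `[r.name for r in parse_fastq_tolerant(lines)]` keeps only the first and third. One known limitation of this sketch: after a malformed record, a quality line that happens to begin with `@` could be mistaken for a header, so a production parser would need stricter resynchronisation.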

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad7f/11183193/a56063aab5e6/IMT2-3-e191-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad7f/11183193/88eeee49c0eb/IMT2-3-e191-g001.jpg
