
SequelTools: a suite of tools for working with PacBio Sequel raw sequence data.

Affiliations

Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA, 50011, USA.

Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, IA, 50010, USA.

Publication information

BMC Bioinformatics. 2020 Oct 1;21(1):429. doi: 10.1186/s12859-020-03751-8.

DOI: 10.1186/s12859-020-03751-8
PMID: 33004007
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7532105/

Abstract

BACKGROUND

PacBio sequencing is an incredibly valuable third-generation DNA sequencing method due to very long read lengths, ability to detect methylated bases, and its real-time sequencing methodology. Yet, hitherto no tool was available for analyzing the quality of, subsampling, and filtering PacBio data.

RESULTS

Here we present SequelTools, a command-line program containing three tools: Quality Control, Read Subsampling, and Read Filtering. The Quality Control tool quickly processes PacBio Sequel raw sequence data from multiple SMRTcells producing multiple statistics and publication-quality plots describing the quality of the data including N50, read length and count statistics, PSR, and ZOR. The Read Subsampling tool allows the user to subsample reads by one or more of the following criteria: longest subreads per CLR or random CLR selection. The Read Filtering tool provides options for normalizing data by filtering out certain low-quality scraps reads and/or by minimum CLR length. SequelTools is implemented in bash, R, and Python using only standard libraries and packages and is platform independent.
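
As an illustration of two ideas named above, the N50 statistic and the "longest subread per CLR" subsampling criterion, here is a minimal Python sketch. It is not SequelTools' own code: the function names, the `(clr_id, length)` input tuples, and the example identifiers are hypothetical, and real Sequel data would first have to be parsed from the subreads and scraps BAM files.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths.

    N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("n50() needs at least one read length")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def longest_subread_per_clr(subreads):
    """Keep only the longest subread length for each CLR.

    `subreads` is an iterable of (clr_id, subread_length) pairs;
    the identifiers are illustrative placeholders.
    """
    best = {}
    for clr_id, length in subreads:
        if length > best.get(clr_id, -1):
            best[clr_id] = length
    return best


def filter_by_min_length(clr_lengths, min_length):
    """Drop CLRs shorter than `min_length` (hypothetical threshold)."""
    return {cid: n for cid, n in clr_lengths.items() if n >= min_length}


if __name__ == "__main__":
    # Toy data: two subreads from one CLR, one subread from another.
    subreads = [("zmw1", 100), ("zmw1", 250), ("zmw2", 80)]
    longest = longest_subread_per_clr(subreads)
    print(longest)
    print(filter_by_min_length(longest, 100))
    print(n50([2, 3, 4, 5, 6]))
```

Sorting the lengths once keeps the N50 computation at O(n log n), and the subsampling step needs only a single pass with one dictionary entry per CLR.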

CONCLUSIONS

SequelTools is a program that provides the only free, fast, and easy-to-use quality control tool, and the only program providing this kind of read subsampling and read filtering for PacBio Sequel raw sequence data, and is available at https://github.com/ISUgenomics/SequelTools.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a91/7532105/80b001d53f9e/12859_2020_3751_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a91/7532105/04adef4a1359/12859_2020_3751_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a91/7532105/d20639f65f18/12859_2020_3751_Fig3_HTML.jpg
