A Benchmark Study on Error Assessment and Quality Control of CCS Reads Derived from the PacBio RS

Authors

Jiao Xiaoli, Zheng Xin, Ma Liang, Kutty Geetha, Gogineni Emile, Sun Qiang, Sherman Brad T, Hu Xiaojun, Jones Kristine, Raley Castle, Tran Bao, Munroe David J, Stephens Robert, Liang Dun, Imamichi Tomozumi, Kovacs Joseph A, Lempicki Richard A, Huang Da Wei

Affiliation

Laboratory of Immunopathogenesis and Bioinformatics, SAIC-Frederick, Inc., Frederick National Laboratory, MD 21702, USA.

Publication

J Data Mining Genomics Proteomics. 2013 Jul 31;4(3). doi: 10.4172/2153-0602.1000136.

DOI:10.4172/2153-0602.1000136
PMID:24179701
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3811116/
Abstract

PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20 kb), in contrast to the shorter reads produced by first- and second-generation sequencing technologies. Because the platform is new, it is important to assess its sequencing error rate and the quality control (QC) parameters associated with PacBio sequence data. In this study, a mixture of 10 previously known, closely related DNA amplicons was sequenced on the PacBio RS. After aligning the Circular Consensus Sequence (CCS) reads from this experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC and improved to 1.3% with an SVM-based multi-parameter QC method. In addition, a de novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that, even though CCS reads are already error-corrected, appropriate QC on CCS reads is still necessary for downstream bioinformatics analyses to succeed.
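
The median error rate reported above can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes each read spans a full amplicon, assigns each read to its closest reference by plain edit distance instead of running a real aligner, and defines the error rate as edits divided by reference length. The function names and the toy sequences are hypothetical.

```python
from statistics import median


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: substitutions, insertions and deletions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def read_error_rate(read: str, references: list[str]) -> float:
    """Error rate of one read against its closest reference.

    Assumed definition (not taken from the paper): edits / reference length.
    """
    best_ref = min(references, key=lambda ref: edit_distance(read, ref))
    return edit_distance(read, best_ref) / len(best_ref)


def median_error_rate(reads: list[str], references: list[str]) -> float:
    """Median of the per-read error rates."""
    return median(read_error_rate(r, references) for r in reads)


# Toy data standing in for closely related amplicons and their CCS reads.
references = ["ACGTACGTAC", "ACGTTCGTAC"]
reads = ["ACGTACGTAC",   # exact copy of the first reference
         "ACGTACGAAC",   # one substitution relative to the first reference
         "ACGTTCGTA"]    # one deletion relative to the second reference

print(median_error_rate(reads, references))
```

Running the same function on the reads that survive a QC filter and comparing the two medians mirrors the before-and-after comparison in the abstract. The SVM-based filter itself is not reproduced here.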
