
TopoQual polishes circular consensus sequencing data and accurately predicts quality scores.

Authors

Weerakoon Minindu, Lee Sangjin, Mitchell Emily, Heaton Haynes

Affiliations

Auburn University, Auburn, AL, 36849, USA.

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

Publication details

BMC Bioinformatics. 2025 Jan 16;26(1):17. doi: 10.1186/s12859-024-06020-0.

DOI: 10.1186/s12859-024-06020-0
PMID: 39815230
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11737182/
Abstract

BACKGROUND

Pacific Biosciences (PacBio) circular consensus sequencing (CCS), also known as high fidelity (HiFi) technology, has revolutionized modern genomics by producing long (10+ kb) and highly accurate reads. This is achieved by sequencing circularized DNA molecules multiple times and combining them into a consensus sequence. Currently, the accuracy and quality value estimation provided by HiFi technology are more than sufficient for applications such as genome assembly and germline variant calling. However, there are limitations in the accuracy of the estimated quality scores when it comes to somatic variant calling on single reads.
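
To make the idea of "combining multiple passes into a consensus" concrete, here is a deliberately simplified sketch. It assumes the passes over the same molecule have already been aligned to equal length (with `-` marking gaps) and takes a column-wise majority vote. This is only a toy illustration: the function name `majority_consensus` and the example reads are hypothetical, and real CCS consensus calling uses far more sophisticated alignment and probabilistic models than a plain vote.

```python
from collections import Counter


def majority_consensus(passes):
    """Column-wise majority vote over equal-length, pre-aligned passes.

    '-' marks a gap. Columns whose winning symbol is a gap are dropped
    from the consensus. Toy model only, not PacBio's CCS algorithm.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")

    consensus = []
    for column in zip(*passes):
        # Pick the most frequent symbol observed in this column.
        base, _count = Counter(column).most_common(1)[0]
        if base != "-":
            consensus.append(base)
    return "".join(consensus)


# Hypothetical passes over one circularized molecule.
passes = [
    "ACGTTAGC",
    "ACGTAAGC",  # substitution error in one pass
    "ACG-TAGC",  # deletion in one pass
    "ACGTTAGC",
    "ACGTTAGC",
]
print(majority_consensus(passes))
```

Because each error appears in only one of the five passes, the vote recovers the underlying sequence. Errors that recur across passes are the harder case, and they are where per-base quality estimates matter most.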

RESULTS

To address the challenge of inaccurate quality scores for somatic variant calling, we introduce TopoQual, a novel tool designed to enhance the accuracy of base quality predictions. TopoQual leverages techniques including partial order alignments (POA), topologically parallel bases, and deep learning algorithms to polish consensus sequences. Our results demonstrate that TopoQual corrects approximately 31.9% of errors in PacBio consensus sequences. Additionally, it validates base qualities up to q59, which corresponds to one error in 0.9 million bases. These improvements will significantly enhance the reliability of somatic variant calling using HiFi data.
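
The quality values mentioned above are on the Phred scale, where a score Q corresponds to an error probability p = 10^(-Q/10). The sketch below converts between the two and shows one common way a predicted score can be checked empirically: count the errors among bases assigned that score and convert the observed rate back to a Phred value. The function names are hypothetical and this is not TopoQual's validation code. Under the standard definition, Q59 works out to roughly one error per 0.8 million bases, which is the same order of magnitude as the figure quoted in the abstract.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score implied by an error probability in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


def empirical_phred(errors, total):
    """Observed quality for a bin of bases: errors counted out of total.

    With zero observed errors the empirical score is unbounded, so this
    sketch raises instead of returning infinity.
    """
    if total <= 0 or errors <= 0 or errors > total:
        raise ValueError("need 0 < errors <= total")
    return error_prob_to_phred(errors / total)


for q in (20, 30, 59):
    p = phred_to_error_prob(q)
    print(f"Q{q}: p = {p:.3g}, about one error in {1 / p:,.0f} bases")
```

A practical consequence of the scale is that validating very high scores needs a great deal of data: observing even a single error at the Q59 level requires on the order of a million bases in that quality bin.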

CONCLUSION

TopoQual represents a significant advancement in genomics by improving the accuracy of base quality predictions for PacBio HiFi sequencing data. By correcting a substantial proportion of errors and achieving high base quality validation, TopoQual enables confident and accurate somatic variant calling. This tool not only addresses a critical limitation of current HiFi technology but also opens new possibilities for precise genomic analysis in various research and clinical applications.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbe6/11737182/ec0439afe264/12859_2024_6020_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbe6/11737182/e67497c40a65/12859_2024_6020_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbe6/11737182/22b6abd8d4fb/12859_2024_6020_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbe6/11737182/ee6e6d7e62b0/12859_2024_6020_Fig2_HTML.jpg


