Repun: an accurate small variant representation unification method for multiple sequencing platforms.

Affiliations

Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong, 999077, China.

Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, Nangang District, Harbin, Heilongjiang 150001, China.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae613.

DOI:10.1093/bib/bbae613
PMID:39584701
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11586763/
Abstract

Ensuring a unified variant representation aligning the sequencing data is critical for downstream analysis as variant representation may differ across platforms and sequencing conditions. Current approaches typically treat variant unification as a post-step following variant calling and are incapable of measuring the correct variant representation from the outset. Aligning variant representations with the alignment before variant calling has benefits like providing reliable training labels for deep learning-based variant caller model training and enabling direct assessment of alignment quality. However, it also poses challenges due to the large number of candidates to handle. Here, we present Repun, a haplotype-aware variant-alignment unification algorithm that harmonizes the variant representation between provided variants and alignments in different sequencing platforms. Repun leverages phasing to facilitate equivalent haplotype matches between variants and alignments. Our approach reduced the comparisons between variant haplotypes and candidate haplotypes by utilizing haplotypes with read evidence to speed up the unification process. Repun achieved >99.99% precision and >99.5% recall through extensive evaluations of various Genome in a Bottle Consortium samples encompassing three sequencing platforms: Oxford Nanopore Technology, Pacific Biosciences, and Illumina. Repun is open-source and available at https://github.com/zhengzhenxian/Repun.
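To make the notion of "equivalent haplotype matches" concrete, the following minimal Python sketch shows how two differently written variant sets can be checked for equivalence by applying each to the reference and comparing the resulting sequences. This is an illustration only and not Repun's implementation: the function names (`apply_variants`, `same_haplotype`, `matching_candidates`), the 0-based coordinates, and the toy sequences are all assumptions introduced here.

```python
from typing import List, Tuple

# Hypothetical variant record: (0-based position, REF allele, ALT allele).
Variant = Tuple[int, str, str]


def apply_variants(reference: str, variants: List[Variant]) -> str:
    """Build the haplotype sequence produced by non-overlapping variants."""
    pieces = []
    cursor = 0  # first reference base not yet copied into the haplotype
    for pos, ref, alt in sorted(variants):
        if pos < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous variant")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"REF allele does not match reference at {pos}")
        pieces.append(reference[cursor:pos])  # unchanged stretch
        pieces.append(alt)                    # substituted allele
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces)


def same_haplotype(reference: str, a: List[Variant], b: List[Variant]) -> bool:
    """Two representations are equivalent if they yield the same sequence."""
    return apply_variants(reference, a) == apply_variants(reference, b)


def matching_candidates(
    reference: str,
    truth: List[Variant],
    candidates: List[List[Variant]],
) -> List[List[Variant]]:
    """Keep only the candidate variant sets whose haplotype equals the truth haplotype."""
    target = apply_variants(reference, truth)
    return [c for c in candidates if apply_variants(reference, c) == target]


# Toy example: deleting one "CA" unit from a repeat, written two ways.
REFERENCE = "GCACACAT"
left_aligned = [(0, "GCA", "G")]
right_shifted = [(4, "ACA", "A")]
assert same_haplotype(REFERENCE, left_aligned, right_shifted)

# A multi-nucleotide substitution versus two adjacent SNPs.
assert same_haplotype("ACGT", [(1, "CG", "TA")], [(1, "C", "T"), (2, "G", "A")])
```

In this toy setting, `matching_candidates` compares the truth haplotype only against the candidate sets it is given. Passing only candidates that are supported by reads, rather than every possible combination, loosely mirrors the abstract's point about reducing the number of haplotype comparisons.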

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6dcf/11586763/fe4ccfc71541/bbae613f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6dcf/11586763/8f4c7dcd58e3/bbae613f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6dcf/11586763/dbc9a2bc5f80/bbae613f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6dcf/11586763/04c282c26751/bbae613f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6dcf/11586763/d1fce0a1d8a2/bbae613f5.jpg

Similar articles

1. Repun: an accurate small variant representation unification method for multiple sequencing platforms.
   Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae613.
2. A Long-Read Sequencing Approach for Direct Haplotype Phasing in Clinical Settings.
   Int J Mol Sci. 2020 Dec 1;21(23):9177. doi: 10.3390/ijms21239177.
3. Local read haplotagging enables accurate long-read small variant calling.
   Nat Commun. 2024 Jul 13;15(1):5907. doi: 10.1038/s41467-024-50079-5.
4. Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data.
   Elife. 2024 Oct 10;13:RP98300. doi: 10.7554/eLife.98300.
5. Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing.
   Nat Commun. 2019 Oct 11;10(1):4660. doi: 10.1038/s41467-019-12493-y.
6. miniSNV: accurate and fast single nucleotide variant calling from nanopore sequencing data.
   Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae473.
7. DCHap: A Divide-and-Conquer Haplotype Phasing Algorithm for Third-Generation Sequences.
   IEEE/ACM Trans Comput Biol Bioinform. 2022 May-Jun;19(3):1277-1284. doi: 10.1109/TCBB.2020.3005673. Epub 2022 Jun 3.
8. Isaac: ultra-fast whole-genome secondary analysis on Illumina sequencing platforms.
   Bioinformatics. 2013 Aug 15;29(16):2041-3. doi: 10.1093/bioinformatics/btt314. Epub 2013 Jun 4.
9. Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data.
   Bioinformatics. 2013 Sep 15;29(18):2245-52. doi: 10.1093/bioinformatics/btt386. Epub 2013 Jul 3.
10. Fast and sensitive mapping of nanopore sequencing reads with GraphMap.
    Nat Commun. 2016 Apr 15;7:11307. doi: 10.1038/ncomms11307.

Cited by

1. Indel calling from ONT sequencing data of family trios via sparse attention and 3D convolution.
   Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf430.