Multithreaded variant calling in elPrep 5.

Affiliations

ExaScience Life Lab, imec, Leuven, Belgium.

Department of Information Technology, Ghent University - imec, Ghent, Belgium.

Publication information

PLoS One. 2021 Feb 4;16(2):e0244471. doi: 10.1371/journal.pone.0244471. eCollection 2021.

DOI:10.1371/journal.pone.0244471
PMID:33539352
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7861424/
Abstract

We present elPrep 5, which updates the elPrep framework for processing sequencing alignment/map files with variant calling. elPrep 5 can now execute the full pipeline described by the GATK Best Practices for variant calling, which consists of PCR and optical duplicate marking, sorting by coordinate order, base quality score recalibration, and variant calling using the haplotype caller algorithm. elPrep 5 produces BAM and VCF output identical to that of GATK4 while significantly reducing the runtime by parallelizing and merging the execution of the pipeline steps. Our benchmarks show that elPrep 5 speeds up the variant calling pipeline by a factor of 8 to 16 on both whole-exome and whole-genome data while using the same hardware resources as GATK4. This makes elPrep 5 a suitable drop-in replacement for GATK4 when faster execution times are needed.
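
To make the first two pipeline steps named in the abstract more concrete, here is a minimal Python sketch of duplicate marking followed by coordinate sorting on toy reads. It is an illustration only, not elPrep's or GATK's implementation: the `Read` class, the grouping key (chromosome, position, strand), and the "keep the read with the highest summed base quality" rule are simplifying assumptions, and base quality score recalibration and haplotype calling are omitted entirely.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Read:
    """Hypothetical toy read; real SAM/BAM records carry many more fields."""
    name: str
    chrom: str
    pos: int            # 1-based leftmost alignment position
    reverse: bool       # True if aligned to the reverse strand
    base_quals: tuple   # per-base Phred quality scores
    duplicate: bool = False


def mark_duplicates(reads):
    """Flag every read except the best one in each (chrom, pos, strand) group.

    Simplifying assumption: "best" means highest summed base quality.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.pos, read.reverse)].append(read)
    for group in groups.values():
        best = max(group, key=lambda r: sum(r.base_quals))
        for read in group:
            read.duplicate = read is not best
    return reads


def sort_by_coordinate(reads, chrom_order):
    """Sort reads by reference sequence order, then by position (stable)."""
    rank = {chrom: i for i, chrom in enumerate(chrom_order)}
    return sorted(reads, key=lambda r: (rank[r.chrom], r.pos))


reads = [
    Read("r1", "chr2", 100, False, (30, 30, 30)),
    Read("r2", "chr1", 500, False, (20, 20, 20)),
    Read("r3", "chr1", 500, False, (35, 35, 35)),
    Read("r4", "chr1", 42, True, (30, 30, 30)),
]

processed = sort_by_coordinate(mark_duplicates(reads), ["chr1", "chr2"])
for read in processed:
    print(read.name, read.chrom, read.pos, "duplicate" if read.duplicate else "")
```

In this toy input, `r2` and `r3` share a chromosome, position, and strand, so the lower-quality `r2` is flagged while both reads stay in the sorted output, mirroring the fact that duplicate marking flags reads rather than removing them.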

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7685/7861424/e7a7f6cea88a/pone.0244471.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7685/7861424/06c930bca814/pone.0244471.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7685/7861424/3d2da024906f/pone.0244471.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7685/7861424/a75bfe820794/pone.0244471.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7685/7861424/43160ddd8897/pone.0244471.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7685/7861424/74e29bb64adc/pone.0244471.g006.jpg

Similar articles

1
Multithreaded variant calling in elPrep 5.
PLoS One. 2021 Feb 4;16(2):e0244471. doi: 10.1371/journal.pone.0244471. eCollection 2021.
2
elPrep 4: A multithreaded framework for sequence analysis.
PLoS One. 2019 Feb 13;14(2):e0209523. doi: 10.1371/journal.pone.0209523. eCollection 2019.
3
elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling.
PLoS One. 2015 Jul 16;10(7):e0132868. doi: 10.1371/journal.pone.0132868. eCollection 2015.
4
Halvade somatic: Somatic variant calling with Apache Spark.
Gigascience. 2022 Jan 12;11(1). doi: 10.1093/gigascience/giab094.
5
Impact of post-alignment processing in variant discovery from whole exome data.
BMC Bioinformatics. 2016 Oct 3;17(1):403. doi: 10.1186/s12859-016-1279-z.
6
ADS-HCSpark: A scalable HaplotypeCaller leveraging adaptive data segmentation to accelerate variant calling on Spark.
BMC Bioinformatics. 2019 Feb 14;20(1):76. doi: 10.1186/s12859-019-2665-0.
7
UNDR ROVER - a fast and accurate variant caller for targeted DNA sequencing.
BMC Bioinformatics. 2016 Apr 16;17:165. doi: 10.1186/s12859-016-1014-9.
8
A comparison of three programming languages for a full-fledged next-generation sequencing tool.
BMC Bioinformatics. 2019 Jun 3;20(1):301. doi: 10.1186/s12859-019-2903-5.
9
Comparing Ease of Programming in C++, Go, and Java for Implementing a Next-Generation Sequencing Tool.
Evol Bioinform Online. 2019 Aug 15;15:1176934319869015. doi: 10.1177/1176934319869015. eCollection 2019.
10
Validation and assessment of variant calling pipelines for next-generation sequencing.
Hum Genomics. 2014 Jul 30;8(1):14. doi: 10.1186/1479-7364-8-14.

Cited by

1
Pangenome graphs and their applications in biodiversity genomics.
Nat Genet. 2025 Jan;57(1):13-26. doi: 10.1038/s41588-024-02029-6. Epub 2025 Jan 8.
2
Genomic diversity of the Japanese wheat core collection and selection of alleles for agronomic traits in the breeding process.
Breed Sci. 2024 Jun;74(3):259-273. doi: 10.1270/jsbbs.23064. Epub 2024 Jun 25.
3
Global invasion history with climate-related allele frequency shifts in the invasive Mediterranean fruit fly (Diptera, Tephritidae: Ceratitis capitata).
Sci Rep. 2024 Oct 26;14(1):25549. doi: 10.1038/s41598-024-76390-1.
4
COSAP: Comparative Sequencing Analysis Platform.
BMC Bioinformatics. 2024 Mar 26;25(1):130. doi: 10.1186/s12859-024-05756-z.
5
A pangenome graph reference of 30 chicken genomes allows genotyping of large and complex structural variants.
BMC Biol. 2023 Nov 22;21(1):267. doi: 10.1186/s12915-023-01758-0.
6
Genomic adaptive potential to cold environments in the invasive red swamp crayfish.
iScience. 2023 Jul 3;26(8):107267. doi: 10.1016/j.isci.2023.107267. eCollection 2023 Aug 18.
7
DeltaMSI: artificial intelligence-based modeling of microsatellite instability scoring on next-generation sequencing data.
BMC Bioinformatics. 2023 Mar 1;24(1):73. doi: 10.1186/s12859-023-05186-3.
8
in the Indian Ocean: A tale of two invasions.
Evol Appl. 2022 Dec 1;16(1):48-61. doi: 10.1111/eva.13507. eCollection 2023 Jan.
9
From molecules to genomic variations: Accelerating genome analysis via intelligent algorithms and architectures.
Comput Struct Biotechnol J. 2022 Aug 18;20:4579-4599. doi: 10.1016/j.csbj.2022.08.019. eCollection 2022.
10
Halvade somatic: Somatic variant calling with Apache Spark.
Gigascience. 2022 Jan 12;11(1). doi: 10.1093/gigascience/giab094.

References

1
Phase 2 study of afatinib among patients with recurrent and/or metastatic esophageal squamous cell carcinoma.
Cancer. 2020 Oct 15;126(20):4521-4531. doi: 10.1002/cncr.33123. Epub 2020 Aug 4.
2
A comparison of three programming languages for a full-fledged next-generation sequencing tool.
BMC Bioinformatics. 2019 Jun 3;20(1):301. doi: 10.1186/s12859-019-2903-5.
3
Preeclampsia is Associated with Sex-Specific Transcriptional and Proteomic Changes in Fetal Erythroid Cells.
Int J Mol Sci. 2019 Apr 25;20(8):2038. doi: 10.3390/ijms20082038.
4
elPrep 4: A multithreaded framework for sequence analysis.
PLoS One. 2019 Feb 13;14(2):e0209523. doi: 10.1371/journal.pone.0209523. eCollection 2019.
5
Halvade-RNA: Parallel variant calling from transcriptomic data using MapReduce.
PLoS One. 2017 Mar 30;12(3):e0174575. doi: 10.1371/journal.pone.0174575. eCollection 2017.
6
Impact of post-alignment processing in variant discovery from whole exome data.
BMC Bioinformatics. 2016 Oct 3;17(1):403. doi: 10.1186/s12859-016-1279-z.
7
Evaluating the necessity of PCR duplicate removal from next-generation sequencing data and a comparison of approaches.
BMC Bioinformatics. 2016 Jul 25;17 Suppl 7(Suppl 7):239. doi: 10.1186/s12859-016-1097-3.
8
elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling.
PLoS One. 2015 Jul 16;10(7):e0132868. doi: 10.1371/journal.pone.0132868. eCollection 2015.
9
Halvade: scalable sequence analysis with MapReduce.
Bioinformatics. 2015 Aug 1;31(15):2482-8. doi: 10.1093/bioinformatics/btv179. Epub 2015 Mar 26.
10
From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline.
Curr Protoc Bioinformatics. 2013;43(1110):11.10.1-11.10.33. doi: 10.1002/0471250953.bi1110s43.