Performance evaluation of pipelines for mapping, variant calling and interval padding, for the analysis of NGS germline panels.

Authors

Zanti Maria, Michailidou Kyriaki, Loizidou Maria A, Machattou Christina, Pirpa Panagiota, Christodoulou Kyproula, Spyrou George M, Kyriacou Kyriacos, Hadjisavvas Andreas

Affiliations

Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus.

Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus.

Publication information

BMC Bioinformatics. 2021 Apr 28;22(1):218. doi: 10.1186/s12859-021-04144-1.

DOI:10.1186/s12859-021-04144-1
PMID:33910496
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8080428/
Abstract

BACKGROUND

Next-generation sequencing (NGS) represents a significant advance in clinical genetics. However, its use creates technical, data interpretation and data management challenges. A consistent data analysis pipeline is essential to achieve the highest possible accuracy and avoid false variant calls. Herein, we aimed to compare the performance of twenty-eight combinations of NGS data analysis pipeline components, covering short-read mapping (BWA-MEM, Bowtie2, Stampy), variant calling (GATK-HaplotypeCaller, GATK-UnifiedGenotyper, SAMtools) and interval padding (null, 50 bp, 100 bp) methods, along with a commercially available pipeline (BWA Enrichment, Illumina®). Fourteen germline DNA samples from breast cancer patients were sequenced using a targeted NGS panel approach and subjected to data analysis.
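As a rough illustration of how the comparison grid could be laid out, the sketch below enumerates every mapper, caller and padding option named in the abstract. The full cross product gives 27 open-source combinations; reading the commercial BWA Enrichment pipeline as the twenty-eighth is an assumption, since the abstract does not break the count down. The variable and function names are hypothetical.

```python
from itertools import product

# Options exactly as named in the abstract; padding "null" is modelled as 0 bp.
MAPPERS = ["BWA-MEM", "Bowtie2", "Stampy"]
CALLERS = ["GATK-HaplotypeCaller", "GATK-UnifiedGenotyper", "SAMtools"]
PADDINGS_BP = [0, 50, 100]


def enumerate_pipelines():
    """Return every mapper x caller x padding combination as a dict."""
    return [
        {"mapper": mapper, "caller": caller, "padding_bp": padding}
        for mapper, caller, padding in product(MAPPERS, CALLERS, PADDINGS_BP)
    ]


pipelines = enumerate_pipelines()
print(len(pipelines))  # 3 * 3 * 3 open-source combinations

# Assumption: the commercial pipeline is counted as one extra configuration.
commercial = {"mapper": "BWA Enrichment (Illumina)", "caller": None, "padding_bp": None}
all_configs = pipelines + [commercial]
print(len(all_configs))
```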

RESULTS

We highlight that interval padding is required for the accurate detection of intronic variants, including spliceogenic pathogenic variants (PVs). In addition, using nearly default parameters, the BWA Enrichment algorithm failed to detect these spliceogenic PVs and a missense PV in the TP53 gene. We also recommend the BWA-MEM algorithm for sequence alignment, whereas variant calling should be performed using a combination of algorithms: GATK-HaplotypeCaller and SAMtools for the accurate detection of insertions/deletions, and GATK-UnifiedGenotyper for the efficient detection of single nucleotide variants.
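To make the padding point concrete, here is a minimal sketch of what interval padding does to a set of target regions. It is not the authors' implementation: the coordinates are invented, intervals are assumed to be 0-based and half-open (BED-style), and `pad_intervals` and `is_covered` are hypothetical helpers. A variant lying a few bases into an intron falls outside the unpadded exon targets but inside the padded ones.

```python
def pad_intervals(intervals, padding_bp):
    """Extend each (chrom, start, end) interval by padding_bp on both sides,
    clamp starts at 0, and merge intervals that overlap or touch."""
    padded = sorted(
        (chrom, max(0, start - padding_bp), end + padding_bp)
        for chrom, start, end in intervals
    )
    merged = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def is_covered(intervals, chrom, pos):
    """True if the 0-based position lies inside any half-open interval."""
    return any(c == chrom and s <= pos < e for c, s, e in intervals)


# Invented exon targets and an invented intronic variant 5 bp past an exon end.
exons = [("chr17", 1000, 1100), ("chr17", 1180, 1300)]
variant = ("chr17", 1105)

for padding in (0, 50, 100):
    targets = pad_intervals(exons, padding)
    print(padding, targets, is_covered(targets, *variant))
```

With 50 bp of padding the two example exons merge into a single target, which is why the helper merges after extending rather than returning overlapping regions.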

CONCLUSIONS

These findings have important implications for the identification of clinically actionable variants through panel testing in a clinical laboratory setting, where dedicated bioinformatics personnel might not always be available. The results also reveal the need to improve existing tools and/or develop new pipelines to generate more reliable and more consistent data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6723/8080428/2d1f6e26bb5f/12859_2021_4144_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6723/8080428/c486207fc2c0/12859_2021_4144_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6723/8080428/5c81afd26e35/12859_2021_4144_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6723/8080428/1533f6a254ef/12859_2021_4144_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6723/8080428/678005103990/12859_2021_4144_Fig5_HTML.jpg