Performance evaluation of pipelines for mapping, variant calling and interval padding, for the analysis of NGS germline panels.

Authors

Zanti Maria, Michailidou Kyriaki, Loizidou Maria A, Machattou Christina, Pirpa Panagiota, Christodoulou Kyproula, Spyrou George M, Kyriacou Kyriacos, Hadjisavvas Andreas

Affiliations

Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus.

Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus.

Publication information

BMC Bioinformatics. 2021 Apr 28;22(1):218. doi: 10.1186/s12859-021-04144-1.

Abstract

BACKGROUND

Next-generation sequencing (NGS) represents a significant advancement in clinical genetics. However, its use creates several technical, data interpretation and data management challenges. It is essential to follow a consistent data analysis pipeline to achieve the highest possible accuracy and avoid false variant calls. Herein, we aimed to compare the performance of twenty-eight combinations of NGS data analysis pipeline components, including short-read mapping (BWA-MEM, Bowtie2, Stampy), variant calling (GATK-HaplotypeCaller, GATK-UnifiedGenotyper, SAMtools) and interval padding (null, 50 bp, 100 bp) methods, along with a commercially available pipeline (BWA Enrichment, Illumina®). Fourteen germline DNA samples from breast cancer patients were sequenced using a targeted NGS panel approach and subjected to data analysis.
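The comparison grid described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the abstract does not break down how the count of twenty-eight arises, so the sketch assumes one reading that is consistent with it, namely the full 3 × 3 × 3 grid of mapper, caller and padding (27 combinations) plus the commercial pipeline as a single extra entry. The dictionary keys and the function name `enumerate_pipelines` are hypothetical.

```python
from itertools import product

# Components named in the abstract.
MAPPERS = ["BWA-MEM", "Bowtie2", "Stampy"]
CALLERS = ["GATK-HaplotypeCaller", "GATK-UnifiedGenotyper", "SAMtools"]
PADDINGS_BP = [0, 50, 100]  # 0 stands for the "null" (no padding) setting


def enumerate_pipelines():
    """Return every mapper/caller/padding combination plus the commercial pipeline.

    Assumption: the commercial pipeline counts as one additional entry;
    the abstract does not state this explicitly.
    """
    combos = [
        {"mapper": mapper, "caller": caller, "padding_bp": padding}
        for mapper, caller, padding in product(MAPPERS, CALLERS, PADDINGS_BP)
    ]
    combos.append(
        {
            "mapper": "BWA Enrichment (Illumina)",
            "caller": "BWA Enrichment (Illumina)",
            "padding_bp": None,  # padding setting not given in the abstract
        }
    )
    return combos


pipelines = enumerate_pipelines()
print(len(pipelines))  # 27 grid combinations + 1 commercial pipeline = 28
```

Listing the combinations explicitly makes it easy to run every sample through each configuration and to tabulate results per mapper, caller or padding setting.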

RESULTS

We highlight that interval padding is required for the accurate detection of intronic variants, including spliceogenic pathogenic variants (PVs). In addition, when run with nearly default parameters, the BWA Enrichment algorithm failed to detect these spliceogenic PVs and a missense PV in the TP53 gene. We also recommend the BWA-MEM algorithm for sequence alignment, whereas variant calling should be performed using a combination of algorithms: GATK-HaplotypeCaller and SAMtools for the accurate detection of insertions/deletions, and GATK-UnifiedGenotyper for the efficient detection of single nucleotide variants.
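To illustrate why padding matters for intronic variants, the following minimal sketch extends target intervals by a fixed number of base pairs on each side and checks whether a variant position falls inside them. It is a simplified stand-in for what interval padding does in a real pipeline, not the implementation used in the study. The coordinates, the chromosome label and the helper names `pad_intervals` and `is_covered` are hypothetical, and intervals are assumed to be 0-based and half-open, as in BED files.

```python
from typing import List, Tuple

# (chromosome, start, end), 0-based half-open coordinates as in BED files.
Interval = Tuple[str, int, int]


def pad_intervals(intervals: List[Interval], padding_bp: int) -> List[Interval]:
    """Extend each interval by padding_bp on both sides and merge overlaps."""
    padded = sorted(
        (chrom, max(0, start - padding_bp), end + padding_bp)
        for chrom, start, end in intervals
    )
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def is_covered(chrom: str, pos0: int, intervals: List[Interval]) -> bool:
    """Return True if the 0-based position lies inside any interval."""
    return any(c == chrom and s <= pos0 < e for c, s, e in intervals)


# Hypothetical exon targets and an intronic variant 5 bp past the first exon.
exons = [("chr17", 1000, 1200), ("chr17", 1260, 1400)]
variant_chrom, variant_pos0 = "chr17", 1204

for padding in (0, 50, 100):
    targets = pad_intervals(exons, padding)
    print(padding, is_covered(variant_chrom, variant_pos0, targets))
```

In this toy case the variant lies outside the unpadded exon targets and inside them once 50 bp or 100 bp of padding is applied, which mirrors the situation the abstract describes for spliceogenic variants near exon boundaries.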

CONCLUSIONS

These findings have important implications for the identification of clinically actionable variants through panel testing in clinical laboratory settings, where dedicated bioinformatics personnel might not always be available. The results also reveal the need to improve existing tools and/or develop new pipelines that generate more reliable and more consistent data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6723/8080428/2d1f6e26bb5f/12859_2021_4144_Fig1_HTML.jpg
