The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species.

Affiliation

DGIMI, Univ Montpellier, INRAE, Montpellier, France.

Publication information

Sci Rep. 2022 Jul 5;12(1):11331. doi: 10.1038/s41598-022-15563-2.

Abstract

Identification of genetic variations is a central part of population and quantitative genomics studies based on high-throughput sequencing data. Even though popular variant callers such as Bcftools mpileup and GATK HaplotypeCaller were developed nearly 10 years ago, their performance is still largely unknown for non-human species. Here, we showed by benchmark analyses with a simulated insect population that Bcftools mpileup performs better than GATK HaplotypeCaller in terms of recovery rate and accuracy regardless of mapping software. The vast majority of false positives were observed from repeats, especially for GATK HaplotypeCaller. Variant scores calculated by GATK did not clearly distinguish true positives from false positives in the vast majority of cases, implying that hard-filtering with GATK could be challenging. These results suggest that Bcftools mpileup may be the first choice for non-human studies and that variants within repeats might have to be excluded for downstream analyses.
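
The benchmark comparison described above can be illustrated with a minimal sketch. It assumes that "recovery rate" means the fraction of simulated (true) variants that a caller reports, and that "accuracy" means the fraction of reported variants that are true (precision); the abstract does not define either term, so both definitions are assumptions. The function names, the variant tuple layout, and the toy data are hypothetical, not taken from the paper.

```python
from bisect import bisect_right

# A variant is a hypothetical tuple: (chromosome, 1-based position, ref, alt).

def benchmark(called, truth):
    """Compare a caller's variants with the simulated truth set."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # true positives: called and real
    fp = len(called - truth)   # false positives: called but not real
    fn = len(truth - called)   # false negatives: real but missed
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        # Assumed definition of recovery rate: TP / all true variants.
        "recovery_rate": tp / len(truth) if truth else 0.0,
        # Assumed definition of accuracy: TP / all called variants.
        "precision": tp / len(called) if called else 0.0,
    }

def in_repeat(variant, repeats):
    """True if the variant position lies inside a repeat interval.

    `repeats` maps chromosome -> sorted, non-overlapping list of
    1-based inclusive (start, end) intervals.
    """
    chrom, pos = variant[0], variant[1]
    intervals = repeats.get(chrom, [])
    # Index of the last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos <= intervals[i][1]

def drop_repeat_variants(variants, repeats):
    """Keep only the variants that fall outside annotated repeats."""
    return [v for v in variants if not in_repeat(v, repeats)]

# Toy data (invented for illustration only).
truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr1", 900, "G", "A")]
called = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr1", 510, "T", "C"), ("chr1", 520, "A", "C")]
repeats = {"chr1": [(500, 600)]}

before = benchmark(called, truth)
after = benchmark(drop_repeat_variants(called, repeats),
                  drop_repeat_variants(truth, repeats))
print(before["precision"], after["precision"])
```

In this toy case both false positives fall inside the repeat interval, so masking repeats raises precision without changing the recovery rate. That mirrors the abstract's suggestion only qualitatively; real data would need repeat annotations and a truth set for the species in question.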

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd07/9256665/d453f06def72/41598_2022_15563_Fig1_HTML.jpg
