Efficient identification of de novo mutations in family trios: a consensus-based informatic approach.

Authors

Shadrina Mariya, Kalay Özem, Demirkaya-Budak Sinem, LeDuc Charles A, Chung Wendy K, Turgut Deniz, Budak Gungor, Arslan Elif, Semenyuk Vladimir, Davis-Dusenbery Brandi, Seidman Christine E, Yost H Joseph, Jain Amit, Gelb Bruce D

Affiliations

Mindich Child Health and Development Institute and the Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York, NY, USA.

Velsera Inc., Charlestown, MA, USA.

Publication information

Life Sci Alliance. 2025 Mar 28;8(6). doi: 10.26508/lsa.202403039. Print 2025 Jun.

Abstract

Accurate identification of de novo variants (DNVs) remains challenging despite advances in sequencing technologies, often requiring ad hoc filters and manual inspection. Here, we explored a purely informatic, consensus-based approach for identifying DNVs in proband-parent trios using short-read genome sequencing data. We evaluated variant calls generated by three sequence analysis pipelines (GATK HaplotypeCaller, DeepTrio, and Velsera GRAF) and examined the assumption that a requirement of consensus can serve as an effective filter for high-quality DNVs. Comparison with a highly accurate DNV set, validated previously by manual inspection and Sanger sequencing, demonstrated that consensus filtering, followed by a force-calling procedure, effectively removed false-positive calls, achieving 98.0-99.4% precision. At the same time, the sensitivity of the workflow, measured against the previously established DNVs, reached 99.4%. Validation in the HG002-3-4 Genome-in-a-Bottle trio confirmed its robustness, with precision reaching 99.2% and sensitivity up to 96.6%. We believe that this consensus approach can be widely implemented as an automated bioinformatics workflow suitable for large-scale analyses without the need for manual intervention, especially when very high precision is valued over sensitivity.
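
The consensus idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' workflow: it assumes each pipeline's candidate DNVs have already been reduced to (chromosome, position, ref, alt) records, it omits the force-calling step entirely, and the names `Variant`, `consensus`, and `precision_sensitivity` as well as the toy coordinates are hypothetical. It only shows how requiring agreement among callers can drop caller-specific false positives, and how precision and sensitivity are computed against a validated truth set.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key; real pipelines would parse this from VCF."""
    chrom: str
    pos: int
    ref: str
    alt: str


def consensus(*call_sets: Set[Variant]) -> Set[Variant]:
    """Keep only the variants reported by every caller (strict intersection)."""
    if not call_sets:
        return set()
    return set.intersection(*call_sets)


def precision_sensitivity(calls: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Precision = TP / all calls; sensitivity = TP / all true variants."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    sensitivity = true_positives / len(truth) if truth else 0.0
    return precision, sensitivity


# Toy data (made up for illustration, not taken from the study).
v1 = Variant("chr1", 1000, "A", "G")
v2 = Variant("chr2", 2000, "C", "T")
v3 = Variant("chr3", 3000, "G", "A")
fp1 = Variant("chr4", 4000, "T", "C")  # false positive shared by two callers
fp2 = Variant("chr5", 5000, "A", "T")  # false positive from a single caller

truth = {v1, v2, v3}            # stands in for a validated DNV set
caller_a = {v1, v2, v3, fp1}    # stands in for one pipeline's DNV calls
caller_b = {v1, v2, v3, fp2}
caller_c = {v1, v2, fp1}

consensus_calls = consensus(caller_a, caller_b, caller_c)
precision, sensitivity = precision_sensitivity(consensus_calls, truth)
print(sorted(consensus_calls), precision, sensitivity)
```

In this toy case the intersection removes both false positives but also loses `v3`, which one caller missed. That mirrors the trade-off the abstract points to: a consensus requirement favors precision and can cost sensitivity.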

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0558/11953573/1bb2f0ac8141/LSA-2024-03039_GA.jpg
