Boosting variant-calling performance with multi-platform sequencing data using Clair3-MP.

Affiliation

Department of Computer Science, The University of Hong Kong, Pok Fu Lam, Hong Kong SAR, China.

Publication information

BMC Bioinformatics. 2023 Aug 3;24(1):308. doi: 10.1186/s12859-023-05434-6.

Abstract

BACKGROUND

With continuing advances in third-generation sequencing and the increasing affordability of next-generation sequencing, data from multiple sequencing platforms are becoming more common. While numerous benchmarking studies have compared variant-calling performance across platforms and approaches, little attention has been paid to combining the strengths of different platforms to improve overall performance, in particular by integrating Oxford Nanopore Technologies (ONT) and Illumina sequencing data.

RESULTS

We investigated the impact of multi-platform data on variant-calling performance through carefully designed experiments with a deep learning-based variant caller named Clair3-MP (Multi-Platform). We demonstrated that combined ONT-Illumina data can improve variant calling and identified the scenarios in which using ONT-Illumina data is most beneficial. We also found that the improvement comes from difficult genomic regions, such as large low-complexity regions and segmental and collapsed duplication regions. Moreover, Clair3-MP can incorporate reference genome stratification information to achieve a small but measurable improvement in variant calling. Clair3-MP is available as an open-source project at https://github.com/HKU-BAL/Clair3-MP.
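To make concrete what a per-region comparison involves, the following minimal sketch computes precision, recall, and F1 for a call set against a truth set, restricted to one stratification of the genome. It is not Clair3-MP code and not the benchmarking pipeline used in the study: the function names (`in_regions`, `stratified_f1`), the exact-match definition of a true positive, and the toy variants and intervals are all assumptions made for illustration only.

```python
from bisect import bisect_right

# Hypothetical toy representation (an assumption, not a real file format parser):
#   variant = (chrom, pos, ref, alt), with pos 0-based
#   regions_by_chrom = {chrom: [(start, end), ...]}, half-open, sorted, non-overlapping


def in_regions(variant, regions_by_chrom):
    """Return True if the variant position falls inside any interval."""
    chrom, pos = variant[0], variant[1]
    intervals = regions_by_chrom.get(chrom, [])
    starts = [start for start, _ in intervals]
    # Index of the last interval whose start is <= pos, or -1 if none.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def stratified_f1(truth, calls, regions_by_chrom):
    """Precision, recall and F1 counting only variants inside the regions.

    A call is a true positive only if it matches a truth variant exactly
    (same chromosome, position, ref and alt allele).
    """
    truth_in = {v for v in truth if in_regions(v, regions_by_chrom)}
    calls_in = {v for v in calls if in_regions(v, regions_by_chrom)}
    tp = len(truth_in & calls_in)
    fp = len(calls_in - truth_in)
    fn = len(truth_in - calls_in)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up toy data: two "difficult" intervals on chr1 and two call sets.
regions = {"chr1": [(100, 200), (500, 600)]}
truth = {("chr1", 150, "A", "G"), ("chr1", 550, "C", "T"), ("chr1", 900, "G", "A")}
calls_a = {("chr1", 150, "A", "G"), ("chr1", 560, "T", "C"), ("chr1", 900, "G", "A")}
calls_b = {("chr1", 150, "A", "G"), ("chr1", 550, "C", "T"), ("chr1", 900, "G", "A")}

result_a = stratified_f1(truth, calls_a, regions)
result_b = stratified_f1(truth, calls_b, regions)
```

Restricting both sets to the same intervals before counting is what lets a difference between two call sets be attributed to a particular class of regions rather than to the genome as a whole. Real benchmarks also have to handle genotype matching and alternative variant representations, which this sketch ignores.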

CONCLUSIONS

These insights have important implications for researchers and practitioners alike, providing valuable guidance for improving the reliability and efficiency of genomic analysis in diverse applications.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9135/10401749/36e445a9dd8d/12859_2023_5434_Fig1_HTML.jpg
