Mahmoud Medhat, Agustinho Daniel P, Sedlazeck Fritz J
Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA.
Genome Res. 2025 Apr 14;35(4):545-558. doi: 10.1101/gr.279975.124.
Over the past decade, long-read sequencing has evolved into a pivotal technology for uncovering the hidden and complex regions of the genome. Significant advances in cost efficiency, scalability, and accuracy have driven this evolution. Concurrently, novel analytical methods have emerged to harness the full potential of long reads. Together, these advances have enabled milestones such as the first fully completed human genome, improved identification and understanding of complex genomic variants, and deeper insights into the interplay between epigenetics and genomic variation. This mini-review provides a comprehensive overview of the latest developments in long-read DNA sequencing analysis, encompassing reference-based and de novo assembly approaches. We explore the entire workflow, from initial data processing to variant calling and annotation, focusing on how these methods improve our ability to interpret a wide array of genomic variants. Additionally, we discuss the current challenges, limitations, and future directions in the field, offering a detailed examination of state-of-the-art bioinformatics methods for long-read sequencing.