De Coster Wouter, Strazisar Mojca, De Rijk Peter
VIB-UAntwerp Center for Molecular Neurology, 2610 Antwerp, Belgium.
NAR Genom Bioinform. 2020 Jan 13;2(1):lqz027. doi: 10.1093/nargab/lqz027. eCollection 2020 Mar.
Long-read sequencing has substantial advantages over short-read technologies for structural variant discovery and variant phasing, but the required and optimal read length has not been assessed. In this work, we simulated long reads from human genomes and evaluated structural variant discovery and variant phasing using current best-practice bioinformatics methods. We determined that optimal discovery of structural variants in human genomes can be obtained with reads of at least 20 kb. Haplotyping variants across genes reaches its optimum only with reads of 100 kb. These findings are important for the design of future long-read sequencing projects.