Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China.
Nebula Genomics, Harbin, 150030, Heilongjiang, China.
Genome Biol. 2020 Aug 3;21(1):189. doi: 10.1186/s13059-020-02107-y.
Long-read sequencing is promising for the comprehensive discovery of structural variations (SVs). However, it is still non-trivial to achieve high yields and performance simultaneously due to the complex SV signatures implied by noisy long reads. We propose cuteSV, a sensitive, fast, and scalable long-read-based SV detection approach. cuteSV uses tailored methods to collect the signatures of various types of SVs and employs a clustering-and-refinement method to implement sensitive SV detection. Benchmarks on simulated and real long-read sequencing datasets demonstrate that cuteSV has higher yields and scaling performance than state-of-the-art tools. cuteSV is available at https://github.com/tjiangHIT/cuteSV.