Long-read-based human genomic structural variation detection with cuteSV.

Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China.

Nebula Genomics, Harbin, 150030, Heilongjiang, China.

Genome Biol. 2020 Aug 3;21(1):189. doi: 10.1186/s13059-020-02107-y.

Long-read sequencing is promising for the comprehensive discovery of structural variations (SVs). However, it is still non-trivial to achieve high yields and performance simultaneously due to the complex SV signatures implied by noisy long reads. We propose cuteSV, a sensitive, fast, and scalable long-read-based SV detection approach. cuteSV uses tailored methods to collect the signatures of various types of SVs and employs a clustering-and-refinement method to implement sensitive SV detection. Benchmarks on simulated and real long-read sequencing datasets demonstrate that cuteSV has higher yields and scaling performance than state-of-the-art tools. cuteSV is available at https://github.com/tjiangHIT/cuteSV .
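The clustering-and-refinement idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not cuteSV's actual implementation: the `Signature` record, the function names, and the `max_gap`, `length_ratio`, and `min_support` thresholds are all hypothetical, chosen only to show the general pattern of grouping nearby SV signatures by position, splitting each group by length similarity, and keeping groups supported by enough distinct reads.

```python
from collections import namedtuple
from statistics import median

# Hypothetical record for one SV signature extracted from a read alignment.
Signature = namedtuple("Signature", "chrom svtype pos length read")


def cluster_by_position(signatures, max_gap=200):
    """Group signatures of the same chromosome and type whose sorted
    positions are at most max_gap apart (assumed threshold)."""
    clusters = []
    ordered = sorted(signatures, key=lambda s: (s.chrom, s.svtype, s.pos))
    for sig in ordered:
        last = clusters[-1] if clusters else None
        if (last
                and last[-1].chrom == sig.chrom
                and last[-1].svtype == sig.svtype
                and sig.pos - last[-1].pos <= max_gap):
            last.append(sig)
        else:
            clusters.append([sig])
    return clusters


def refine_by_length(cluster, length_ratio=0.7):
    """Split one positional cluster into subgroups of similar SV length."""
    groups = []
    for sig in sorted(cluster, key=lambda s: s.length):
        # Lengths are ascending, so compare the previous length to the current one.
        if groups and groups[-1][-1].length >= length_ratio * sig.length:
            groups[-1].append(sig)
        else:
            groups.append([sig])
    return groups


def call_svs(signatures, min_support=2, max_gap=200, length_ratio=0.7):
    """Return (chrom, svtype, pos, length, support) for well-supported groups."""
    calls = []
    for cluster in cluster_by_position(signatures, max_gap):
        for group in refine_by_length(cluster, length_ratio):
            support = len({s.read for s in group})  # count distinct reads
            if support >= min_support:
                calls.append((group[0].chrom, group[0].svtype,
                              int(median(s.pos for s in group)),
                              int(median(s.length for s in group)),
                              support))
    return calls


if __name__ == "__main__":
    toy = [
        Signature("chr1", "DEL", 1000, 300, "r1"),
        Signature("chr1", "DEL", 1010, 310, "r2"),
        Signature("chr1", "DEL", 1020, 290, "r3"),
        Signature("chr1", "DEL", 1015, 50, "r4"),
        Signature("chr2", "INS", 5000, 120, "r5"),
    ]
    print(call_svs(toy))
```

On the toy input, the three roughly 300 bp deletion signatures near position 1010 are merged into one call, while the 50 bp signature at the same locus and the lone insertion are dropped for lack of support. Real data would need type-specific handling and tuned thresholds, which this sketch does not attempt.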
