Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
Nat Biotechnol. 2024 Oct;42(10):1571-1580. doi: 10.1038/s41587-023-02024-y. Epub 2024 Jan 2.
Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing repeat-aware clustering coupled with fast consensus sequence generation and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SVs showed remarkable diversity within the cingulate cortex, affecting both genes involved in neuron function and repetitive elements.