SVIM: structural variant identification using mapped long reads.

Affiliation

Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.

Publication information

Bioinformatics. 2019 Sep 1;35(17):2907-2915. doi: 10.1093/bioinformatics/btz041.

DOI:10.1093/bioinformatics/btz041

PMID:30668829

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6735718/

Abstract

MOTIVATION

Structural variants are defined as genomic variants larger than 50 bp. They have been shown to affect more bases in any given genome than single-nucleotide polymorphisms or small insertions and deletions. Additionally, they have great impact on human phenotype and diversity and have been linked to numerous diseases. Due to their size and association with repeats, they are difficult to detect by shotgun sequencing, especially when based on short reads. Long read, single-molecule sequencing technologies like those offered by Pacific Biosciences or Oxford Nanopore Technologies produce reads with a length of several thousand base pairs. Despite the higher error rate and sequencing cost, long-read sequencing offers many advantages for the detection of structural variants. Yet, available software tools still do not fully exploit the possibilities.

RESULTS

We present SVIM, a tool for the sensitive detection and precise characterization of structural variants from long-read data. SVIM consists of three components for the collection, clustering and combination of structural variant signatures from read alignments. It discriminates five different variant classes including similar types, such as tandem and interspersed duplications and novel element insertions. SVIM is unique in its capability of extracting both the genomic origin and destination of duplications. It compares favorably with existing tools in evaluations on simulated data and real datasets from Pacific Biosciences and Nanopore sequencing machines.
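The collection and clustering steps named above can be illustrated with a small sketch. This is not SVIM's implementation: the `Signature` type, the `cluster_signatures` function, the `max_gap` parameter and the greedy position-based grouping are all hypothetical stand-ins chosen for brevity. Only the 50 bp size threshold comes from the abstract.

```python
from dataclasses import dataclass
from itertools import groupby

# Variants larger than 50 bp count as structural variants (per the abstract).
MIN_SV_SIZE = 50


@dataclass(frozen=True)
class Signature:
    """Evidence for one variant from one read alignment (hypothetical type)."""
    chrom: str
    start: int
    end: int
    sv_type: str  # illustrative label, e.g. "DEL"
    read: str

    @property
    def size(self) -> int:
        # Reference span; adequate for deletions, which is all this sketch uses.
        return self.end - self.start


def cluster_signatures(signatures, max_gap=100):
    """Greedily group same-type signatures on one chromosome whose start
    positions lie within max_gap bp of the previous signature's start.
    A simplified stand-in, not SVIM's actual clustering method."""
    kept = sorted(
        (s for s in signatures if s.size > MIN_SV_SIZE),
        key=lambda s: (s.chrom, s.sv_type, s.start),
    )
    clusters = []
    for _, group in groupby(kept, key=lambda s: (s.chrom, s.sv_type)):
        current = []
        for sig in group:
            if current and sig.start - current[-1].start > max_gap:
                clusters.append(current)
                current = []
            current.append(sig)
        clusters.append(current)
    return clusters


# Toy input: two reads supporting the same deletion, one sub-threshold
# event, and one distant deletion supported by a single read.
signatures = [
    Signature("chr1", 1000, 1300, "DEL", "read1"),
    Signature("chr1", 1010, 1290, "DEL", "read2"),
    Signature("chr1", 5000, 5020, "DEL", "read3"),  # 20 bp, filtered out
    Signature("chr1", 9000, 9400, "DEL", "read4"),
]
clusters = cluster_signatures(signatures)
for cluster in clusters:
    print([s.read for s in cluster])
```

In this toy input the two overlapping deletions fall into one cluster, the 20 bp event is discarded as too small, and the distant deletion forms its own cluster. A real caller would need a more careful distance measure and a further step to combine clusters into the variant classes listed above.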

AVAILABILITY AND IMPLEMENTATION

The source code and executables of SVIM are available on GitHub: github.com/eldariont/svim. SVIM has been implemented in Python 3 and published on Bioconda and the Python Package Index.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

SVIM-asm: structural variant detection from haploid and diploid genome assemblies.

Bioinformatics. 2021 Apr 1;36(22-23):5519-5521. doi: 10.1093/bioinformatics/btaa1034.

lordFAST: sensitive and Fast Alignment Search Tool for LOng noisy Read sequencing Data.

Bioinformatics. 2019 Jan 1;35(1):20-27. doi: 10.1093/bioinformatics/bty544.

Discovery of tandem and interspersed segmental duplications using high-throughput sequencing.

Bioinformatics. 2019 Oct 15;35(20):3923-3930. doi: 10.1093/bioinformatics/btz237.

BleTIES: annotation of natural genome editing in ciliates using long read sequencing.

Bioinformatics. 2021 Nov 5;37(21):3929-3931. doi: 10.1093/bioinformatics/btab613.

Evaluation of tools for long read RNA-seq splice-aware alignment.

Bioinformatics. 2018 Mar 1;34(5):748-754. doi: 10.1093/bioinformatics/btx668.

Noise-cancelling repeat finder: uncovering tandem repeats in error-prone long-read sequencing data.

Bioinformatics. 2019 Nov 1;35(22):4809-4811. doi: 10.1093/bioinformatics/btz484.

MsPAC: a tool for haplotype-phased structural variant detection.

Bioinformatics. 2020 Feb 1;36(3):922-924. doi: 10.1093/bioinformatics/btz618.

SVJedi: genotyping structural variations with long reads.

Bioinformatics. 2020 Nov 1;36(17):4568-4575. doi: 10.1093/bioinformatics/btaa527.

NucBreak: location of structural errors in a genome assembly by using paired-end Illumina reads.

BMC Bioinformatics. 2020 Feb 21;21(1):66. doi: 10.1186/s12859-020-3414-0.

Cited by

BVSim: A benchmarking variation simulator mimicking human variation spectrum.

Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf095.

Structural Variants: Mechanisms, Mapping, and Interpretation in Human Genetics.

Genes (Basel). 2025 Jul 29;16(8):905. doi: 10.3390/genes16080905.

A Hitchhiker Guide to Structural Variant Calling: A Comprehensive Benchmark Through Different Sequencing Technologies.

Biomedicines. 2025 Aug 9;13(8):1949. doi: 10.3390/biomedicines13081949.

Long read whole genome sequencing-based discovery of structural variants and their role in aetiology of non-syndromic autism spectrum disorder in India.

BMC Med Genomics. 2025 Aug 20;18(1):131. doi: 10.1186/s12920-025-02204-6.

TRsv: simultaneous detection of tandem repeat variations, structural variations, and short indels using long read sequencing data.

Genome Biol. 2025 Aug 20;26(1):246. doi: 10.1186/s13059-025-03718-z.

How Structural Variations Influence Crop Improvement.

Int J Mol Sci. 2025 Jul 10;26(14):6635. doi: 10.3390/ijms26146635.

The crosstalk between host and rumen microbiome in cattle: insights from multi-omics approaches and genome-wide association studies.

World J Microbiol Biotechnol. 2025 Jul 28;41(8):267. doi: 10.1007/s11274-025-04504-6.

Complex genetic variation in nearly complete human genomes.

Nature. 2025 Jul 23. doi: 10.1038/s41586-025-09140-6.

ASVBM: Structural variant benchmarking with local joint analysis for multiple callsets.

Comput Struct Biotechnol J. 2025 Jun 29;27:2851-2862. doi: 10.1016/j.csbj.2025.06.045. eCollection 2025.

A haplotype-resolved pangenome of the barley wild relative Hordeum bulbosum.

Nature. 2025 Jul 9. doi: 10.1038/s41586-025-09270-x.
