A collection of read depth profiles at structural variant breakpoints.

Affiliation

Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, 199004, Russia.

Publication information

Sci Data. 2023 Apr 6;10(1):186. doi: 10.1038/s41597-023-02076-4.

Abstract

SWaveform, a newly created open genome-wide resource for read depth signal in the vicinity of structural variant (SV) breakpoints, aims to boost the development of computational tools and algorithms for discovering genomic rearrangement events from sequencing data. SVs are a dominant force shaping genomes and contribute substantially to genetic diversity. Still, reliable and efficient genotyping of SVs from whole genome sequencing data remains challenging, which delays translation into clinical applications and wastes valuable resources. SWaveform includes a database containing ~7 million read depth profiles at SV breakpoints extracted from 911 sequencing samples generated by the Human Genome Diversity Project, generalised patterns of the signal at breakpoints, an interface for navigation and download, and a toolbox for local deployment with users' own data. The dataset can be of immense value to the bioinformatics and engineering communities, as it enables straightforward application of intelligent signal processing and machine learning techniques to the discovery of genomic rearrangement events and thus opens the way for the development of innovative algorithms and software.
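To make the notion of a "read depth profile at a breakpoint" concrete, the following is a minimal sketch on synthetic data. It is not SWaveform's own code or API: the function names `depth_profile` and `normalise`, the window size, and the coverage values are all hypothetical and chosen only for illustration.

```python
from typing import List


def depth_profile(depth: List[float], breakpoint: int, flank: int) -> List[float]:
    """Return per-base depth for positions [breakpoint - flank, breakpoint + flank).

    Positions that fall outside the depth array are padded with 0.0.
    (Hypothetical helper, not part of SWaveform.)
    """
    profile = []
    for pos in range(breakpoint - flank, breakpoint + flank):
        if 0 <= pos < len(depth):
            profile.append(depth[pos])
        else:
            profile.append(0.0)
    return profile


def normalise(profile: List[float], baseline: float) -> List[float]:
    """Divide each depth value by a baseline coverage (assumed to be known)."""
    if baseline <= 0:
        raise ValueError("baseline coverage must be positive")
    return [d / baseline for d in profile]


# Synthetic per-base depth: 30x coverage that halves at position 10,
# roughly what the start of a heterozygous deletion might look like.
depth = [30.0] * 10 + [15.0] * 10

profile = normalise(depth_profile(depth, breakpoint=10, flank=4), baseline=30.0)
print(profile)  # [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5]
```

Fixed-width, baseline-normalised windows of this kind are one plausible input format for the signal processing and machine learning methods the abstract mentions; the actual profile length and normalisation used by the resource are not stated here.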

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96c0/10079824/387841a6813a/41597_2023_2076_Fig1_HTML.jpg
