Smash++: an alignment-free and memory-efficient tool to find genomic rearrangements.

Affiliations

IEETA/DETI, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal.

Department of Virology, University of Helsinki, Haartmaninkatu 3, 00014 Helsinki, Finland.

Publication information

Gigascience. 2020 May 1;9(5). doi: 10.1093/gigascience/giaa048.

Abstract

BACKGROUND

The development of high-throughput sequencing technologies and the resulting production of huge volumes of genomic data have accelerated biological and medical research and discovery. The study of genomic rearrangements is crucial owing to their role in chromosomal evolution, genetic disorders, and cancer.

RESULTS

We present Smash++, an alignment-free and memory-efficient tool to find and visualize small- and large-scale genomic rearrangements between 2 DNA sequences. This computational solution extracts the information content of the 2 sequences, exploiting a data compression technique to find rearrangements. We also present the Smash++ visualizer, a tool that visualizes the detected rearrangements along with their self- and relative complexity by generating an SVG (Scalable Vector Graphics) image.
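
The abstract does not spell out the compression technique, so the following is only a rough Python sketch of the general alignment-free idea, not the Smash++ implementation. It assumes a single fixed-order context model: k-mer statistics are learned from the reference, the per-base cost in bits of encoding the target with that model is computed, and stretches of the target that are cheap to encode are reported as candidate shared regions. The function names, the smoothing constant `alpha`, the window size, and the threshold are all illustrative choices.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def build_model(ref, k):
    """Count which base follows each k-mer context in the reference."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(k, len(ref)):
        counts[ref[i - k:i]][ref[i]] += 1
    return counts


def information_profile(model, target, k, alpha=1 / 16):
    """Cost in bits of each target base under the reference model.

    profile[j] is the cost of target[j + k]; the first k bases have no
    full context and are skipped. alpha is an assumed smoothing constant.
    """
    profile = []
    for i in range(k, len(target)):
        ctx = model.get(target[i - k:i], {})
        total = sum(ctx.values())
        p = (ctx.get(target[i], 0) + alpha) / (total + alpha * len(ALPHABET))
        profile.append(-math.log2(p))
    return profile


def low_information_regions(profile, k, window=20, threshold=1.0):
    """Approximate (start, end) target intervals that are cheap to encode.

    A sliding-window mean smooths the profile; consecutive windows whose
    mean cost is below the threshold are merged into one interval.
    """
    regions, start = [], None
    for i in range(len(profile) - window + 1):
        mean = sum(profile[i:i + window]) / window
        if mean < threshold and start is None:
            start = i
        elif mean >= threshold and start is not None:
            regions.append((start + k, i + window - 1 + k))
            start = None
    if start is not None:
        regions.append((start + k, len(profile) + k))
    return regions
```

Called as `low_information_regions(information_profile(build_model(ref, 8), target, 8), 8)`, the sketch returns approximate target coordinates of segments that also occur in `ref`. Because of the smoothing window, region boundaries are accurate only to within a few bases, and this toy version does not attempt to detect inverted segments or to report the matching reference coordinates.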

CONCLUSIONS

Tested on several synthetic and real DNA sequences from bacteria, fungi, Aves, and Mammalia, the proposed tool was able to accurately find genomic rearrangements. The detected regions were in accordance with previous studies, which took alignment-based approaches or performed FISH (fluorescence in situ hybridization) analysis. The maximum peak memory usage among all experiments was ∼1 GB, which makes Smash++ feasible to run on present-day standard computers.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a34c/7238676/503ddc5f9773/giaa048fig1.jpg
