Simplitigs as an efficient and scalable representation of de Bruijn graphs.

Affiliations

Department of Biomedical Informatics and Laboratory of Systems Pharmacology, Harvard Medical School, Boston, USA and Broad Institute of MIT and Harvard, Cambridge, USA.

Center for Communicable Disease Dynamics, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, USA.

Publication information

Genome Biol. 2021 Apr 6;22(1):96. doi: 10.1186/s13059-021-02297-z.

Abstract

de Bruijn graphs play an essential role in bioinformatics, yet they lack a universal scalable representation. Here, we introduce simplitigs as a compact, efficient, and scalable representation, and ProphAsm, a fast algorithm for their computation. For the example of assemblies of model organisms and two bacterial pan-genomes, we compare simplitigs to unitigs, the best existing representation, and demonstrate that simplitigs provide a substantial improvement in the cumulative sequence length and their number. When combined with the commonly used Burrows-Wheeler Transform index, simplitigs reduce memory, and index loading and query times, as demonstrated with large-scale examples of GenBank bacterial pan-genomes.
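To make the idea concrete, the following is a minimal sketch of how simplitigs can be computed greedily from a k-mer set. It is an illustration only, not the ProphAsm implementation: it assumes a simple greedy strategy (pick an arbitrary unused k-mer, extend it in both directions while unused neighbours exist) and treats a k-mer and its reverse complement as the same k-mer. All function names (`canonical`, `kmers_of`, `extend_forward`, `greedy_simplitigs`) are hypothetical and chosen for this example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[c] for c in reversed(seq))


def canonical(kmer):
    """Represent a k-mer and its reverse complement by the smaller of the two."""
    return min(kmer, reverse_complement(kmer))


def kmers_of(seq, k):
    """Set of canonical k-mers occurring in seq."""
    return {canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def extend_forward(simplitig, remaining, k):
    """Append bases while some unused k-mer overlaps the current suffix."""
    while True:
        suffix = simplitig[-(k - 1):] if k > 1 else ""
        for base in "ACGT":
            candidate = canonical(suffix + base)
            if candidate in remaining:
                remaining.remove(candidate)  # each k-mer is used exactly once
                simplitig += base
                break
        else:
            # No unused neighbour: the extension in this direction is finished.
            return simplitig


def greedy_simplitigs(kmers, k):
    """Greedily cover a k-mer set with strings in which no k-mer repeats."""
    remaining = {canonical(x) for x in kmers}
    result = []
    while remaining:
        seed = remaining.pop()  # arbitrary unused k-mer
        s = extend_forward(seed, remaining, k)
        # Extending the reverse complement forward is the same as
        # extending the original string backward.
        s = extend_forward(reverse_complement(s), remaining, k)
        result.append(s)
    return result


if __name__ == "__main__":
    k = 3
    kmers = kmers_of("ACGTTGCA", k) | kmers_of("TTGCATGA", k)
    for s in greedy_simplitigs(kmers, k):
        print(s)
```

Because the seed is arbitrary, the exact strings may differ between runs, but two properties always hold in this sketch: the strings together contain exactly the input k-mers, and no k-mer appears more than once. Unlike unitigs, the strings are allowed to continue through branching nodes of the graph, which is why fewer and shorter-in-total sequences can result.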

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5244/8025321/10d1c839abbd/13059_2021_2297_Fig1_HTML.jpg
