Matchtigs: minimum plain text representation of k-mer sets.

Affiliations

Department of Computer Science, University of Helsinki, Helsinki, Finland.

Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, India.

Publication information

Genome Biol. 2023 Jun 9;24(1):136. doi: 10.1186/s13059-023-02968-z.

Abstract

We propose a polynomial algorithm computing a minimum plain-text representation of k-mer sets, as well as an efficient near-minimum greedy heuristic. When compressing read sets of large model organisms or bacterial pangenomes, with only a minor runtime increase, we shrink the representation by up to 59% over unitigs and 26% over previous work. Additionally, the number of strings is decreased by up to 97% over unitigs and 90% over previous work. Finally, a small representation has advantages in downstream applications, as it speeds up SSHash-Lite queries by up to 4.26× over unitigs and 2.10× over previous work.
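
To make the two reported metrics concrete (total characters and number of strings), here is a minimal Python sketch. It is not the matchtigs algorithm: it only checks that two sets of strings contain the same k-mers and compares their sizes. The helper names `kmers` and `summarize`, the toy sequences, and the choice of k = 3 are assumptions made for illustration, and reverse complements are ignored for simplicity.

```python
def kmers(strings, k):
    """Return the set of all length-k substrings of the given strings."""
    return {s[i:i + k] for s in strings for i in range(len(s) - k + 1)}


def summarize(strings):
    """Count the strings and the total characters in a representation."""
    return {"strings": len(strings), "characters": sum(len(s) for s in strings)}


k = 3
# Toy representation 1 (hypothetical): every k-mer stored as its own string.
one_per_kmer = ["ACG", "CGT", "GTA"]
# Toy representation 2 (hypothetical): overlapping k-mers merged into one string.
merged = ["ACGTA"]

# Both representations must spell exactly the same k-mer set.
same_kmers = kmers(one_per_kmer, k) == kmers(merged, k)
print(same_kmers, summarize(one_per_kmer), summarize(merged))
```

In this toy case both representations encode the same three 3-mers, but the merged one needs fewer strings and fewer characters, which are the two quantities the abstract reports reductions in.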

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3301/10251615/5cd3be41866a/13059_2023_2968_Fig1_HTML.jpg
