NOREC4DNA: using near-optimal rateless erasure codes for DNA storage.

Affiliation

Department of Mathematics and Computer Science, Philipps-Universität Marburg, 35032, Marburg, Germany.

Publication information

BMC Bioinformatics. 2021 Aug 17;22(1):406. doi: 10.1186/s12859-021-04318-x.

Abstract

BACKGROUND

DNA is a promising storage medium for high-density long-term digital data storage. Since DNA synthesis and sequencing are still relatively expensive tasks, the coding methods used to store digital data in DNA should correct errors and avoid unstable or error-prone DNA sequences. Near-optimal rateless erasure codes, also called fountain codes, are particularly interesting codes to realize high-capacity and low-error DNA storage systems, as shown by Erlich and Zielinski in their approach based on the Luby transform (LT) code. Since LT is the most basic fountain code, there is a large untapped potential for improvement in using near-optimal erasure codes for DNA storage.
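To make the fountain-code idea concrete, here is a minimal sketch of LT-style encoding and peeling decoding. It is an illustration under stated assumptions, not the scheme used by Erlich and Zielinski or by NOREC4DNA: the function names (`split_chunks`, `lt_encode_packet`, `lt_decode`) are hypothetical, the degree is drawn from the ideal soliton distribution rather than the robust soliton distribution that practical LT codes use, and each packet is identified only by the seed that generated it.

```python
import random

def split_chunks(data, size):
    """Pad data with zero bytes and split it into equally sized chunks."""
    padded = data + bytes(-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def lt_encode_packet(chunks, seed):
    """Build one packet: the XOR of a pseudo-randomly chosen subset of chunks."""
    rng = random.Random(seed)
    n = len(chunks)
    # Ideal soliton distribution (assumption for this sketch):
    # P(1) = 1/n, P(d) = 1/(d*(d-1)) for d = 2..n.
    weights = [1 / n] + [1 / (d * (d - 1)) for d in range(2, n + 1)]
    degree = rng.choices(range(1, n + 1), weights=weights)[0]
    indices = rng.sample(range(n), degree)
    payload = bytearray(len(chunks[0]))
    for i in indices:
        for k, b in enumerate(chunks[i]):
            payload[k] ^= b
    return set(indices), bytes(payload)

def lt_decode(packets, n):
    """Peeling decoder: returns the n chunks, or None if it gets stuck."""
    pending = [(set(idx), bytearray(data)) for idx, data in packets]
    decoded = {}
    progress = True
    while progress and len(decoded) < n:
        progress = False
        for idx, data in pending:
            # Subtract (XOR out) every chunk that is already known.
            for i in [i for i in idx if i in decoded]:
                idx.discard(i)
                for k, b in enumerate(decoded[i]):
                    data[k] ^= b
            # A packet that covers exactly one unknown chunk reveals it.
            if len(idx) == 1:
                (i,) = idx
                decoded[i] = bytes(data)
                progress = True
    if len(decoded) < n:
        return None
    return [decoded[i] for i in range(n)]

message = b"DNA storage with fountain codes!"  # 32 bytes
chunks = split_chunks(message, 8)              # 4 chunks of 8 bytes

# Rateless use: keep generating packets until the decoder succeeds.
packets, seed, recovered = [], 0, None
while recovered is None and seed < 10000:
    packets.append(lt_encode_packet(chunks, seed))
    seed += 1
    recovered = lt_decode(packets, len(chunks))
```

The loop at the end shows the rateless property: the encoder has no fixed code rate and simply emits packets until the receiver has collected enough to decode.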

RESULTS

We present NOREC4DNA, a software framework to use, test, compare, and improve near-optimal rateless erasure codes (NORECs) for DNA storage systems. These codes can effectively be used to store digital information in DNA and cope with the restrictions of the DNA medium. Additionally, they can adapt to possible variable lengths of DNA strands and have nearly zero overhead. We describe the design and implementation of NOREC4DNA. Furthermore, we present experimental results demonstrating that NOREC4DNA can flexibly be used to evaluate the use of NORECs in DNA storage systems. In particular, we show that NORECs that apparently have not yet been used for DNA storage, such as Raptor and Online codes, can achieve significant improvements over LT codes that were used in previous work. NOREC4DNA is available at https://github.com/umr-ds/NOREC4DNA.
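One way a rateless code can cope with the restrictions of the DNA medium is by screening: because the encoder can produce practically unlimited packets, a packet whose DNA strand looks error-prone can be discarded and another generated in its place. The sketch below illustrates such a screen. It is not NOREC4DNA's API; the function names, the 2-bit base mapping, and the thresholds (GC content between 40% and 60%, homopolymer runs of at most 3) are assumptions chosen for illustration.

```python
from itertools import groupby

BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T

def bytes_to_dna(payload):
    """Map each byte to four bases, two bits per base, most significant first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in payload
        for shift in (6, 4, 2, 0)
    )

def gc_content(strand):
    """Fraction of bases in the strand that are G or C."""
    return sum(base in "GC" for base in strand) / len(strand)

def max_homopolymer(strand):
    """Length of the longest run of identical consecutive bases."""
    return max(len(list(run)) for _, run in groupby(strand))

def is_acceptable(strand, gc_min=0.4, gc_max=0.6, max_run=3):
    """Apply the (assumed) GC-content and homopolymer constraints."""
    if not strand:
        return False
    return (gc_min <= gc_content(strand) <= gc_max
            and max_homopolymer(strand) <= max_run)

def screen(payloads):
    """Keep only the strands whose sequence passes the constraints."""
    strands = (bytes_to_dna(p) for p in payloads)
    return [s for s in strands if is_acceptable(s)]

# 0x1b = 00 01 10 11 -> "ACGT" (balanced); 0x00 -> "AAAA" (rejected).
kept = screen([b"\x1b", b"\x00"])
```

In a full pipeline the rejected packet would simply be replaced by the next packet from the encoder, so the screen costs encoding time rather than storage capacity.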

CONCLUSION

NOREC4DNA is a flexible and extensible software framework for using, evaluating, and comparing NORECs for DNA storage systems.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7932/8371904/f019b308456f/12859_2021_4318_Fig1_HTML.jpg
