Robust data storage in DNA by de Bruijn graph-based de novo strand assembly.

Affiliations

Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China.

School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China.

Publication information

Nat Commun. 2022 Sep 12;13(1):5361. doi: 10.1038/s41467-022-33046-w.

Abstract

DNA data storage is a rapidly developing technology with great potential due to its high density, long-term durability, and low maintenance cost. The major technical challenges include various errors, such as strand breaks, rearrangements, and indels that frequently arise during DNA synthesis, amplification, sequencing, and preservation. In this study, a de novo strand assembly algorithm (DBGPS) is developed using de Bruijn graph and greedy path search to meet these challenges. DBGPS shows substantial advantages in handling DNA breaks, rearrangements, and indels. The robustness of DBGPS is demonstrated by accelerated aging, multiple independent data retrievals, deep error-prone PCR, and large-scale simulations. Remarkably, 6.8 MB of data is accurately recovered from a severely corrupted sample that has been treated at 70 °C for 70 days. With DBGPS, we are able to achieve a logical density of 1.30 bits/cycle and a physical density of 295 PB/g.
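The abstract names two ingredients, a de Bruijn graph and a greedy path search, without giving implementation details. The following is a minimal Python sketch of that general idea, not the authors' DBGPS code. It assumes that every strand begins with a known seed (for example an index prefix), that the strand length is known in advance, and that k-mers seen fewer than `min_count` times are sequencing noise. The function names `count_kmers` and `greedy_assemble` and all parameters are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer across all reads; each k-mer is an edge of the de Bruijn graph."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def greedy_assemble(counts, seed, length, min_count=2):
    """Grow `seed` one base at a time along the best-supported k-mer.

    Assumptions (not from the source): the seed is exactly k bases long (k >= 2),
    the target length is known, and k-mers below `min_count` are treated as errors.
    Returns None when the path reaches a dead end.
    """
    k = len(seed)
    strand = seed
    while len(strand) < length:
        prefix = strand[-(k - 1):]  # last k-1 bases, i.e. the current graph node
        candidates = [
            (counts[prefix + base], base)
            for base in "ACGT"
            if counts[prefix + base] >= min_count
        ]
        if not candidates:
            return None  # no sufficiently supported edge leaves this node
        strand += max(candidates)[1]  # greedy choice: highest k-mer count
    return strand


# Toy data: two overlapping fragments (simulating a strand break), each seen
# twice, plus one full-length read carrying a substitution error.
strand = "ACGTACCTGAAGTCTTGGCA"
reads = [strand[:12], strand[:12], strand[6:], strand[6:], "ACGTACCTGACGTCTTGGCA"]
recovered = greedy_assemble(count_kmers(reads, 5), strand[:5], len(strand))
```

In this toy input no single error-free read covers the whole strand, yet the fragments overlap enough for the path search to walk from the seed to the full length, which illustrates why graph-based assembly can tolerate breaks. A practical decoder would also need a way to choose among competing paths, which this sketch leaves out.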

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3b9a/9468002/1067da34cb4c/41467_2022_33046_Fig1_HTML.jpg
