State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu 210096, China.
Nucleic Acids Res. 2021 Jun 4;49(10):5451-5469. doi: 10.1093/nar/gkab230.
Deoxyribonucleic acid (DNA) has evolved to be a naturally selected, robust biomacromolecule for genetic information storage, and both biological evolution and various diseases can be traced to uncertainties in DNA-related processes (e.g. replication and expression). Recently, synthetic DNA has emerged as a compelling molecular medium for digital data storage, and it is superior to conventional electronic memory devices in theoretical retention time, power consumption, storage density, and so forth. However, uncertainties in in vitro DNA synthesis and sequencing, along with its conjugation chemistry and preservation conditions, can lead to severe errors and data loss, which limit its practical application. To maintain data integrity, complicated error correction algorithms and substantial data redundancy are usually required, which can significantly limit the efficiency and scale-up of the technology. Herein, we summarize the general procedures of the state-of-the-art DNA-based digital data storage methods (e.g. write, read, and preservation), highlighting the uncertainties involved in each step as well as potential approaches to correct them. We also discuss challenges yet to be overcome and research trends in the promising field of DNA-based data storage.