Zhang Shufang, Wu Jianjun, Huang Beibei, Liu Yuhong
School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072 China.
Computer Science and Engineering Department, Santa Clara University, Santa Clara, CA 95053 USA.
3 Biotech. 2021 Jul;11(7):328. doi: 10.1007/s13205-021-02882-w. Epub 2021 Jun 12.
The high storage density, long life cycle, and low energy consumption of DNA molecules make them a promising medium for next-generation storage technology. However, DNA storage suffers from high synthesis cost and low random-access efficiency. A high-density DNA coding scheme can effectively reduce the cost of DNA synthesis. This paper first proposes a codebook-based DNA mapping method and a random-access method for DNA information based on the encoded content. The mapping method satisfies two biological constraints: homopolymer length and GC content. The random-access method can efficiently and selectively read specific files from the DNA pool. To increase storage density, convolutional neural networks are combined with the mapping method to generate base sequences. In experiments, we compared our method with existing DNA information storage methods, and the results show that the proposed scheme achieves a higher information storage density.
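The two biological constraints named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's codebook construction: the function names and the thresholds (a maximum homopolymer run of 3 and a GC content between 40% and 60%) are assumptions chosen for illustration, since the abstract does not state the values used.

```python
from itertools import groupby


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def max_homopolymer_run(seq: str) -> int:
    """Return the length of the longest run of identical consecutive bases."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def satisfies_constraints(
    seq: str,
    max_run: int = 3,      # assumed threshold, not taken from the paper
    gc_min: float = 0.4,   # assumed threshold, not taken from the paper
    gc_max: float = 0.6,   # assumed threshold, not taken from the paper
) -> bool:
    """Check a base sequence against both constraints."""
    return (
        max_homopolymer_run(seq) <= max_run
        and gc_min <= gc_content(seq) <= gc_max
    )
```

A codebook in this spirit would keep only candidate codewords for which `satisfies_constraints` returns `True`, for example rejecting `"AAAACGCG"` because it contains a run of four identical bases.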