Yang Xu, Shi Xiaolong, Lai Langwen, Chen Congzhou, Xu Huaisheng, Deng Ming
Institute of Computing Science and Technology, Guangzhou University, Guangzhou, China.
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China.
Front Genet. 2023 Jun 13;14:1179867. doi: 10.3389/fgene.2023.1179867. eCollection 2023.
DNA has become a popular candidate for next-generation storage media because of its high storage density and stability. As the storage medium of life's information, DNA offers large storage capacity together with low-cost, low-power replication and transcription. However, storing data in long double-stranded DNA can introduce instabilities that make it difficult to satisfy the constraints of biological systems. To address this challenge, we designed a highly robust coding scheme called the "random code system," inspired by fountain codes. The random code system includes the establishment of a random matrix, Gaussian preprocessing, and random equilibrium. Compared with Luby transform (LT) codes, the random code (RC) is more robust and better at recovering lost information. In biological experiments, we successfully stored 29,390 bits of data in 25,700 bp chains, achieving a storage density of 1.78 bits per nucleotide. These results demonstrate the potential of long double-stranded DNA combined with the random code system for robust DNA-based data storage.
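The fountain-code idea that the abstract builds on can be illustrated with a minimal sketch of a generic random linear fountain code over GF(2). This is not the authors' random code system: it does not implement their Gaussian preprocessing or random equilibrium steps, and it does not map bits to nucleotides. The function names `encode` and `decode`, the integer block representation, and all parameters are assumptions chosen for illustration. Each packet is the XOR of a random subset of source blocks, and the receiver recovers the blocks by Gaussian elimination once the surviving packets have full rank.

```python
import random


def encode(blocks, n_packets, seed=0):
    """Return n_packets (mask, payload) pairs.

    mask is a random nonzero k-bit integer (one row of a random binary
    matrix); bit i set means source block i is XORed into the payload.
    """
    rng = random.Random(seed)
    k = len(blocks)
    packets = []
    for _ in range(n_packets):
        mask = 0
        while mask == 0:  # an all-zero row carries no information
            mask = rng.getrandbits(k)
        payload = 0
        for i, block in enumerate(blocks):
            if mask >> i & 1:
                payload ^= block
        packets.append((mask, payload))
    return packets


def decode(packets, k):
    """Recover the k source blocks by Gauss-Jordan elimination over GF(2).

    Returns the list of blocks, or None if the received packets do not
    span all k blocks (rank deficient).
    """
    pivots = {}  # pivot column -> (mask, payload), kept fully reduced
    for mask, payload in packets:
        # Clear every known pivot column from the incoming row.
        for col, (pmask, ppayload) in pivots.items():
            if mask >> col & 1:
                mask ^= pmask
                payload ^= ppayload
        if mask == 0:
            continue  # linearly dependent on packets already seen
        col = mask.bit_length() - 1  # highest remaining bit becomes the pivot
        # Clear the new pivot column from the rows stored so far.
        for c, (pmask, ppayload) in list(pivots.items()):
            if pmask >> col & 1:
                pivots[c] = (pmask ^ mask, ppayload ^ payload)
        pivots[col] = (mask, payload)
    if len(pivots) < k:
        return None
    return [pivots[i][1] for i in range(k)]


# Hypothetical demo: 8 one-byte blocks, 60 packets, every other packet lost.
blocks = [0x1F, 0xA3, 0x5C, 0x00, 0x7E, 0x42, 0xD9, 0x88]
packets = encode(blocks, n_packets=60, seed=1)
survivors = packets[::2]
recovered = decode(survivors, k=len(blocks))
print(recovered == blocks)
```

In this sketch any sufficiently large set of surviving packets is enough to decode, regardless of which ones were lost. That property is what makes fountain-style codes attractive when some stored sequences may be lost or damaged.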