School of Computer Science and Technology, Dalian University of Technology, 116024, Dalian, China.
School of Computer Science and Technology, Dalian University of Technology, 116024, Dalian, China.
Comput Biol Med. 2022 Dec;151(Pt A):106269. doi: 10.1016/j.compbiomed.2022.106269. Epub 2022 Nov 3.
Using complex biomolecules for storage is a new carbon-based storage method. For example, DNA has the potential to be a good method for archival long-term data storage. Reasonable and efficient coding is the first and most important step in DNA storage. However, current coding methods, such as altruism algorithm, have the problem of low coding efficiency and high complexity, and coding constraints and sets make it difficult to see the coding results visually. In this study, a new DNA storage coding method based on frequency matrix game graph (FMG) is proposed to generate DNA storage coding satisfying combinatorial constraints. Compared with the randomness of the heuristic algorithm that satisfies the constraints, the coding method based on the FMG is deterministic and can clearly explain the coding process. In addition, the constraints and coding results have observable characteristics and are better than the previously published results for the size of the coding set. For example, when length of the code n = 10, hamming distance d = 4, the results obtained by proposed approach combining chaos game and graph are 24% better than the previous results. The proposed coding scheme successfully constructs high-quality coding sets with less complexity, which effectively promotes the development of carbon-based storage coding.
使用复杂的生物分子进行存储是一种新的基于碳的存储方法。例如,DNA 有可能成为存档长期数据存储的好方法。合理高效的编码是 DNA 存储的第一步,也是最重要的一步。然而,当前的编码方法,如利他算法,存在编码效率低、复杂度高的问题,编码约束和集合使得难以直观地看到编码结果。在这项研究中,提出了一种基于频率矩阵博弈图 (FMG) 的新 DNA 存储编码方法,以生成满足组合约束的 DNA 存储编码。与满足约束的启发式算法的随机性相比,基于 FMG 的编码方法是确定性的,可以清晰地解释编码过程。此外,约束和编码结果具有可观察的特征,并且在编码集的大小方面优于先前发布的结果。例如,当码长 n = 10,汉明距离 d = 4 时,混沌博弈和图相结合的方法得到的结果比以前的结果要好 24%。所提出的编码方案成功地构建了具有较小复杂度的高质量编码集,有效地促进了基于碳的存储编码的发展。