Hidden Addressing Encoding for DNA Storage.

Authors

Wang Penghao, Mu Ziniu, Sun Lijun, Si Shuqing, Wang Bin

Affiliation

The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China.

Publication

Front Bioeng Biotechnol. 2022 Jul 19;10:916615. doi: 10.3389/fbioe.2022.916615. eCollection 2022.

Abstract

DNA is a natural storage medium that offers higher storage density and a longer service life than traditional media. DNA storage can meet current storage requirements for massive data. Owing to the limitations of DNA storage technology, data must be converted into short DNA sequences for storage. Indexing these short sequences, however, generates a large amount of physical redundancy. To reduce this redundancy, this study proposes a DNA storage encoding scheme with hidden addressing. Using an improved fountain encoding scheme, the index replaces part of the data to realize hidden addresses, and a 10.1 MB file is then encoded with the hidden addressing. The overall self-similarity of the encoded sequence indexes is first analyzed with the Dottup dot plot generator and the Jaccard similarity coefficient, and the GC content of sequence fragments is then used to verify the performance of the scheme. The results show that the indexes produced by the encoding scheme have lower overall self-similarity and that the local thermodynamic properties of the sequences are better. The proposed hidden addressing encoding scheme can not only improve base utilization but also ensure the accuracy of DNA storage during sequencing and decoding.
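
The two sequence metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the authors computed them, so the details here are assumptions: the Jaccard coefficient is taken over sets of k-mers (with a hypothetical default of k = 4), and GC content is measured over fixed-length windows. The function names `kmers`, `jaccard`, `gc_content`, and `windowed_gc` are illustrative and do not come from the paper.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=4):
    """Jaccard coefficient |A & B| / |A | B| over the k-mer sets of a and b.

    Assumption: k-mer sets are the compared objects; the paper may
    define the sets differently.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:  # both sequences shorter than k
        return 0.0
    return len(ka & kb) / len(union)


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq, window, step=None):
    """GC content of each full window of seq (non-overlapping by default)."""
    step = step or window
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    print(jaccard("ACGTA", "CGTAC"))    # share 1 of 3 distinct 4-mers
    print(windowed_gc("GGCCAATT", 4))   # one GC-only window, one AT-only window
```

A lower Jaccard value between two index sequences means they share fewer k-mers, and windowed GC values near 0.5 indicate balanced local base composition. Both are only proxies for the self-similarity and thermodynamic properties discussed in the abstract.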

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4457/9344065/dea07740549b/fbioe-10-916615-g001.jpg
