A Hierarchical Error Correction Strategy for Text DNA Storage.

Authors

Zan Xiangzhen, Yao Xiangyu, Xu Peng, Chen Zhihua, Xie Lian, Li Shudong, Liu Wenbin

Affiliations

Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China.

Institution of Huangpu Research, Guangzhou University, Guangzhou, 510006, China.

Publication information

Interdiscip Sci. 2022 Mar;14(1):141-150. doi: 10.1007/s12539-021-00476-x. Epub 2021 Aug 31.

Abstract

DNA storage has become a thriving interdisciplinary research area because of its high density, low maintenance cost, and long durability for information storage. However, the complexity of errors in DNA sequences, including substitutions, insertions, and deletions, hinders its application to massive data storage. Motivated by the divide-and-conquer approach, we propose a hierarchical error correction strategy for text DNA storage. The basic idea is to design robust codes for common characters that can correct a one-base error, including an insertion and/or a deletion. Errors are then corrected progressively at three levels: by the codes within DNA reads, by multiple alignment of character lines, and finally by word spelling. On the one hand, the proposed encoding method provides a systematic way to design storage-friendly codes with properties such as 50% GC content, homopolymers of no more than 2 bases, and robustness against secondary structures. On the other hand, the proposed error correction method not only corrects single insertions or deletions but also handles multiple insertions or deletions. Simulation results demonstrate that the proposed method can correct more than 98% of errors when the error rate is less than or equal to 0.05. Thus, it is more powerful and adaptable to complicated DNA storage applications.
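To make the storage-friendly constraints concrete, the following is a minimal sketch of how candidate codewords could be screened for 50% GC content and homopolymer runs of at most 2 bases. It is not the authors' code design procedure: the function names, the tolerance parameter, and the brute-force enumeration are assumptions made for illustration, and the secondary-structure check mentioned in the abstract is not modeled here.

```python
from itertools import groupby, product


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq) / len(seq)


def max_homopolymer(seq: str) -> int:
    """Return the length of the longest run of identical bases."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return max(len(list(run)) for _, run in groupby(seq))


def is_storage_friendly(seq: str, gc_target: float = 0.5,
                        max_run: int = 2, tol: float = 1e-9) -> bool:
    """Check the two constraints named in the abstract.

    gc_target, max_run, and tol are illustrative parameters (assumptions),
    defaulting to 50% GC content and runs of at most 2 bases.
    """
    gc_ok = abs(gc_fraction(seq) - gc_target) <= tol
    return gc_ok and max_homopolymer(seq) <= max_run


def candidate_codewords(length: int) -> list[str]:
    """Enumerate all sequences of the given length that pass the screen.

    Brute force over 4**length sequences, so only practical for short codes.
    """
    words = ("".join(bases) for bases in product("ACGT", repeat=length))
    return [word for word in words if is_storage_friendly(word)]
```

For example, `is_storage_friendly("AACCGGTT")` passes both checks, while `"GGGCAAAT"` has the right GC content but fails on its 3-base runs. Selecting a subset of the surviving candidates that can also correct one-base errors is a separate step that this sketch does not attempt.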

