Iterative Soft Decoding Algorithm for DNA Storage Using Quality Score and Redecoding.

Publication information

IEEE Trans Nanobioscience. 2024 Jan;23(1):81-90. doi: 10.1109/TNB.2023.3284406. Epub 2024 Jan 3.

Abstract

Ever since deoxyribonucleic acid (DNA) was first considered as a next-generation data-storage medium, considerable research effort has gone into using error-correcting codes (ECCs) to correct errors that occur during the synthesis, storage, and sequencing processes. Previous works on recovering data from an error-containing sequenced DNA pool have used hard decoding algorithms based on a majority decision rule. To improve the correction capability of ECCs and the robustness of the DNA storage system, we propose a new iterative soft decoding algorithm in which soft information is obtained from FASTQ files and channel statistics. In particular, we propose a new formula for log-likelihood ratio (LLR) calculation using quality scores (Q-scores), and a redecoding method that may be suitable for error correction and detection in the DNA sequencing area. Using the widely adopted fountain-code-based encoding scheme proposed by Erlich et al., we evaluate performance on three different sets of sequenced data to show consistency. The proposed soft decoding algorithm yields a 2.3% to 7.0% improvement in read-number reduction compared with the state-of-the-art decoding method, and it is shown to handle erroneous sequenced oligo reads containing insertion and deletion errors.
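The abstract does not state the proposed LLR formula, so the following is only a generic sketch of how soft information can be derived from FASTQ Q-scores, not the authors' method. It uses the standard Phred relation P_err = 10^(-Q/10) and the common Phred+33 ASCII encoding. The two-bit base mapping, the assumption that a miscalled base is equally likely to be any of the other three, and all function names are illustrative assumptions.

```python
import math

# Assumed two-bit mapping per base (illustrative; not taken from the abstract).
BASE_TO_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def phred_to_error_prob(q):
    """Standard Phred relation: probability that the base call is wrong."""
    return 10.0 ** (-q / 10.0)


def quality_string_to_probs(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 by default) to error probabilities."""
    return [phred_to_error_prob(ord(ch) - offset) for ch in quality]


def base_to_bit_llrs(base, p_err):
    """Per-bit LLRs, log P(bit=0)/P(bit=1), for one called base.

    Assumes an erroneous call is uniformly one of the other three bases,
    so each bit is flipped with probability 2*p_err/3.
    """
    p_flip = min(max(2.0 * p_err / 3.0, 1e-12), 1.0 - 1e-12)
    magnitude = math.log((1.0 - p_flip) / p_flip)
    # Positive LLR favours bit 0, negative favours bit 1.
    return [magnitude if bit == 0 else -magnitude for bit in BASE_TO_BITS[base]]


def read_to_llrs(seq, quality):
    """Concatenate per-bit LLRs for a whole read."""
    llrs = []
    for base, p_err in zip(seq, quality_string_to_probs(quality)):
        llrs.extend(base_to_bit_llrs(base, p_err))
    return llrs
```

For example, `read_to_llrs("AT", "II")` returns four values: two positive LLRs for `A` and two negative ones for `T`, with large magnitudes because `I` encodes Q = 40. A soft decoder would use such values in place of hard majority votes. This sketch models substitution errors only and ignores insertions and deletions.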


Similar articles

Random access in large-scale DNA data storage.
Nat Biotechnol. 2018 Mar;36(3):242-248. doi: 10.1038/nbt.4079. Epub 2018 Feb 19.
