Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, 7 Pengfei Street Dapeng New District, Shenzhen, 518120, P. R. China.
National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, 518060, P. R. China.
Adv Sci (Weinh). 2024 Aug;11(30):e2402951. doi: 10.1002/advs.202402951. Epub 2024 Jun 14.
Composite DNA letters, formed by mixing all four DNA nucleotides in specified ratios, offer a way to substantially increase the logical density of DNA digital storage (DDS) systems. However, these letters are susceptible to nucleotide errors and sampling bias, which lead to a high letter error rate, complicate precise data retrieval, and raise reading costs. To address this, Derrick-cp is introduced as a soft-decision decoding algorithm tailored for DDS that uses composite letters. Derrick-cp exploits the differing error sensitivities among letters to predict and correct letter errors, extending the error-correcting performance of Reed-Solomon codes beyond the limits of traditional hard-decision decoding. Comparative analyses on an existing dataset and in simulated experiments validate Derrick-cp: it halves the required sequencing depth and cuts costs by up to 22% relative to conventional hard-decision strategies. These results indicate that Derrick-cp can improve both the precision and the cost-efficiency of composite letter-based DDS.
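The difference between hard and soft decisions on composite letters can be illustrated with a small sketch. It is not the Derrick-cp algorithm, whose details the abstract does not give. It is a generic errors-and-erasures illustration under stated assumptions: a hypothetical alphabet of nucleotide ratios summing to a fixed `resolution`, a squared-distance classifier, and a "margin" reliability score. The names `composite_alphabet`, `classify`, and `flag_erasures` are invented for this example. The motivation is the standard Reed-Solomon bound: a code with n - k parity symbols can correct e errors and s erasures whenever 2e + s <= n - k, so marking a doubtful letter as an erasure costs half as much as letting it become an error.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def composite_alphabet(resolution):
    """All (A, C, G, T) integer ratios that sum to `resolution` (assumed alphabet)."""
    return [r for r in product(range(resolution + 1), repeat=4)
            if sum(r) == resolution]


def classify(column, alphabet):
    """Hard-call one position and report how reliable the call looks.

    `column` holds the bases observed at this position across all reads.
    Returns (best_letter, margin), where margin is the gap between the
    runner-up distance and the best distance. A small margin means sampling
    noise could easily have flipped the call.
    """
    counts = Counter(column)
    depth = len(column)
    freqs = [counts[b] / depth for b in BASES]

    def dist(letter):
        total = sum(letter)
        return sum((f - x / total) ** 2 for f, x in zip(freqs, letter))

    ranked = sorted(alphabet, key=dist)
    best, runner_up = ranked[0], ranked[1]
    return best, dist(runner_up) - dist(best)


def flag_erasures(columns, alphabet, max_erasures):
    """Hard-call every position, then mark the least reliable ones as erasures.

    The returned erasure indices would be passed to an errors-and-erasures
    Reed-Solomon decoder (not implemented here).
    """
    calls = [classify(col, alphabet) for col in columns]
    by_margin = sorted(range(len(calls)), key=lambda i: calls[i][1])
    erasures = sorted(by_margin[:max_erasures])
    return [letter for letter, _ in calls], erasures


if __name__ == "__main__":
    alphabet = composite_alphabet(2)
    # Three positions at sequencing depth 8; the middle one is skewed by sampling.
    columns = ["AAAAAAAA", "AAAAACCC", "GGGGTTTT"]
    letters, erasures = flag_erasures(columns, alphabet, max_erasures=1)
    print(letters, erasures)
```

A hard-decision decoder would use only `letters`. The soft-decision variant also uses the margin, so a position whose observed base frequencies sit between two alphabet letters is handed to the decoder as an erasure rather than trusted.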