NOREC4DNA: using near-optimal rateless erasure codes for DNA storage.

Affiliation

Department of Mathematics and Computer Science, Philipps-Universität Marburg, 35032, Marburg, Germany.

Publication information

BMC Bioinformatics. 2021 Aug 17;22(1):406. doi: 10.1186/s12859-021-04318-x.

DOI:10.1186/s12859-021-04318-x
PMID:34404355
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8371904/
Abstract

BACKGROUND

DNA is a promising storage medium for high-density long-term digital data storage. Since DNA synthesis and sequencing are still relatively expensive tasks, the coding methods used to store digital data in DNA should correct errors and avoid unstable or error-prone DNA sequences. Near-optimal rateless erasure codes, also called fountain codes, are particularly interesting codes to realize high-capacity and low-error DNA storage systems, as shown by Erlich and Zielinski in their approach based on the Luby transform (LT) code. Since LT is the most basic fountain code, there is a large untapped potential for improvement in using near-optimal erasure codes for DNA storage.
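
To make the idea of a rateless (fountain) code with sequence screening concrete, here is a minimal Python sketch. It is not NOREC4DNA's API and not the exact scheme of Erlich and Zielinski. It is a toy LT-style code under several assumptions: the ideal soliton degree distribution, a 2-bit-per-base mapping, an illustrative GC window of 40 to 60 percent, and a homopolymer limit of 3. All function names are hypothetical. For brevity the seed is kept beside the payload instead of being encoded into the strand.

```python
import itertools
import random

BASES = "ACGT"  # assumed mapping, 2 bits per base: 00->A, 01->C, 10->G, 11->T


def packet_indices(seed, k):
    """Derive the set of chunk indices XORed into a packet from its seed alone."""
    rng = random.Random(seed)
    # Ideal soliton distribution: P(1) = 1/k, P(d) = 1/(d*(d-1)) for d = 2..k.
    weights = [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    degree = rng.choices(range(1, k + 1), weights=weights)[0]
    return set(rng.sample(range(k), degree))


def make_payload(chunks, seed):
    """XOR together the chunks selected by the seed."""
    payload = bytearray(len(chunks[0]))
    for i in packet_indices(seed, len(chunks)):
        for j, byte in enumerate(chunks[i]):
            payload[j] ^= byte
    return bytes(payload)


def to_dna(data):
    """Map every byte to four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 3] for byte in data for shift in (6, 4, 2, 0))


def is_stable(strand, max_run=3, gc_range=(0.4, 0.6)):
    """Illustrative screen: bounded homopolymer runs and balanced GC content."""
    gc = sum(base in "GC" for base in strand) / len(strand)
    longest_run = max(len(list(group)) for _, group in itertools.groupby(strand))
    return gc_range[0] <= gc <= gc_range[1] and longest_run <= max_run


def stable_packets(chunks, max_seeds=5000):
    """Yield (seed, payload) pairs, skipping packets whose strand fails the screen."""
    for seed in range(max_seeds):
        payload = make_payload(chunks, seed)
        if is_stable(to_dna(payload)):
            yield seed, payload


def decode(packets, k):
    """Peeling decoder: returns the k chunks, or None if more packets are needed."""
    pending = [(packet_indices(seed, k), bytearray(payload)) for seed, payload in packets]
    solved = {}
    progress = True
    while progress and len(solved) < k:
        progress = False
        for indices, payload in pending:
            # Subtract (XOR out) every chunk that is already known.
            for i in [i for i in indices if i in solved]:
                indices.discard(i)
                for j, byte in enumerate(solved[i]):
                    payload[j] ^= byte
            if len(indices) == 1:  # one unknown left: the payload is that chunk
                solved[indices.pop()] = bytes(payload)
                progress = True
    return [solved[i] for i in range(k)] if len(solved) == k else None


# Demo: keep drawing screened packets until the decoder succeeds.
data = bytes(random.Random(42).getrandbits(8) for _ in range(64))
chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
packets, recovered = [], None
for packet in stable_packets(chunks):
    packets.append(packet)
    recovered = decode(packets, len(chunks))
    if recovered is not None:
        break
print(len(chunks), "chunks recovered from", len(packets), "screened packets")
```

Because any packet can be discarded and replaced by another, the encoder can reject strands that violate the constraints without losing the ability to decode. This is the property that makes fountain codes attractive for DNA storage.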

RESULTS

We present NOREC4DNA, a software framework to use, test, compare, and improve near-optimal rateless erasure codes (NORECs) for DNA storage systems. These codes can effectively be used to store digital information in DNA and cope with the restrictions of the DNA medium. Additionally, they can adapt to possible variable lengths of DNA strands and have nearly zero overhead. We describe the design and implementation of NOREC4DNA. Furthermore, we present experimental results demonstrating that NOREC4DNA can flexibly be used to evaluate the use of NORECs in DNA storage systems. In particular, we show that NORECs that apparently have not yet been used for DNA storage, such as Raptor and Online codes, can achieve significant improvements over LT codes that were used in previous work. NOREC4DNA is available on https://github.com/umr-ds/NOREC4DNA .

CONCLUSION

NOREC4DNA is a flexible and extensible software framework for using, evaluating, and comparing NORECs for DNA storage systems.

Cited by

1
Optimizing fountain codes for DNA data storage.
Comput Struct Biotechnol J. 2024 Oct 26;23:3878-3896. doi: 10.1016/j.csbj.2024.10.038. eCollection 2024 Dec.
2
Data recovery methods for DNA storage based on fountain codes.
Comput Struct Biotechnol J. 2024 Apr 24;23:1808-1823. doi: 10.1016/j.csbj.2024.04.048. eCollection 2024 Dec.
3
Turbo autoencoders for the DNA data storage channel with Autoturbo-DNA.
iScience. 2024 Mar 27;27(5):109575. doi: 10.1016/j.isci.2024.109575. eCollection 2024 May 17.
4
RepairNatrix: a Snakemake workflow for processing DNA sequencing data for DNA storage.
Bioinform Adv. 2023 Aug 26;3(1):vbad117. doi: 10.1093/bioadv/vbad117. eCollection 2023.
5
A digital twin for DNA data storage based on comprehensive quantification of errors and biases.
Nat Commun. 2023 Sep 27;14(1):6026. doi: 10.1038/s41467-023-41729-1.
6
Towards long double-stranded chains and robust DNA-based data storage using the random code system.
Front Genet. 2023 Jun 13;14:1179867. doi: 10.3389/fgene.2023.1179867. eCollection 2023.
7
Study on DNA Storage Encoding Based IAOA under Innovation Constraints.
Curr Issues Mol Biol. 2023 Apr 18;45(4):3573-3590. doi: 10.3390/cimb45040233.
8
DNA-Aeon provides flexible arithmetic coding for constraint adherence and error correction in DNA storage.
Nat Commun. 2023 Feb 6;14(1):628. doi: 10.1038/s41467-023-36297-3.

References

1
Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads.
BMC Bioinformatics. 2020 Nov 16;21(1):526. doi: 10.1186/s12859-020-03852-4.
2
MESA: automated assessment of synthetic DNA fragments and simulation of DNA synthesis, storage, sequencing and PCR errors.
Bioinformatics. 2020 Jun 1;36(11):3322-3326. doi: 10.1093/bioinformatics/btaa140.
3
A highly parallel strategy for storage of digital information in living cells.
BMC Biotechnol. 2018 Oct 17;18(1):64. doi: 10.1186/s12896-018-0476-4.
4
DNA Fountain enables a robust and efficient storage architecture.
Science. 2017 Mar 3;355(6328):950-954. doi: 10.1126/science.aaj2038.
5
Robust chemical preservation of digital information on DNA in silica with error-correcting codes.
Angew Chem Int Ed Engl. 2015 Feb 16;54(8):2552-5. doi: 10.1002/anie.201411378. Epub 2015 Feb 4.
6
DNA based computing for understanding complex shapes.
Biosystems. 2014 Mar;117:40-53. doi: 10.1016/j.biosystems.2014.01.003. Epub 2014 Jan 18.
7
Mutation rates, spectra, and genome-wide distribution of spontaneous mutations in mismatch repair deficient yeast.
G3 (Bethesda). 2013 Sep 4;3(9):1453-65. doi: 10.1534/g3.113.006429.
8
Towards practical, high-capacity, low-maintenance information storage in synthesized DNA.
Nature. 2013 Feb 7;494(7435):77-80. doi: 10.1038/nature11875. Epub 2013 Jan 23.
9
Next-generation digital information storage in DNA.
Science. 2012 Sep 28;337(6102):1628. doi: 10.1126/science.1226355. Epub 2012 Aug 16.
10
DMSO and betaine greatly improve amplification of GC-rich constructs in de novo synthesis.
PLoS One. 2010 Jun 11;5(6):e11024. doi: 10.1371/journal.pone.0011024.