FrameD: framework for DNA-based data storage design, verification, and validation.

Authors

Volkel Kevin D, Lin Kevin N, Hook Paul W, Timp Winston, Keung Albert J, Tuck James M

Affiliations

Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, 27606, United States.

Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC, 27695, United States.

Publication information

Bioinformatics. 2023 Oct 3;39(10). doi: 10.1093/bioinformatics/btad572.

DOI:10.1093/bioinformatics/btad572
PMID:37713474
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10563143/
Abstract

MOTIVATION

DNA-based data storage is a quickly growing field that hopes to harness the massive theoretical information density of DNA molecules to produce a competitive next-generation storage medium suitable for archival data. In recent years, many DNA-based storage system designs have been proposed. Given that no common infrastructure exists for simulating these storage systems, comparing many different designs along with many different error models is increasingly difficult. To address this challenge, we introduce FrameD, a simulation infrastructure for DNA storage systems that leverages the underlying modularity of DNA storage system designs to provide a framework to express different designs while being able to reuse common components.

RESULTS

We demonstrate the utility of FrameD and the need for a common simulation platform using a case study. Our case study compares designs that utilize strand copies differently, some that align strand copies using multiple sequence alignment algorithms and others that do not. We found that the choice to include multiple sequence alignment in the pipeline is dependent on the error rate and the type of errors being injected and is not always beneficial. In addition to supporting a wide range of designs, FrameD provides the user with transparent parallelism to deal with a large number of reads from sequencing and the need for many fault injection iterations. We believe that FrameD fills a void in the tools publicly available to the DNA storage community by providing a modular and extensible framework with support for massive parallelism. As a result, it will help accelerate the design process of future DNA-based storage systems.
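
The dependence on error type noted above can be illustrated with a small fault-injection experiment. The following is a minimal sketch only, not FrameD's code or API: every function name, the error rates, the strand length, and the number of copies are assumptions chosen for illustration. It rebuilds a strand from several noisy copies by per-position majority vote with no alignment step, once under substitution errors and once under deletion errors.

```python
import random
from collections import Counter

BASES = "ACGT"

def inject_substitutions(strand, rate, rng):
    """Replace each base with a different base with probability `rate`."""
    out = []
    for base in strand:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def inject_deletions(strand, rate, rng):
    """Drop each base independently with probability `rate`."""
    return "".join(b for b in strand if rng.random() >= rate)

def positional_consensus(copies, length):
    """Majority vote at each index, with no alignment of the copies."""
    consensus = []
    for i in range(length):
        votes = Counter(c[i] for c in copies if i < len(c))
        # "N" is a placeholder for positions that no copy covers.
        consensus.append(votes.most_common(1)[0][0] if votes else "N")
    return "".join(consensus)

def recovery_rate(injector, rate, copies=5, length=60, trials=200, seed=0):
    """Fraction of trials in which the consensus equals the original strand."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        original = "".join(rng.choice(BASES) for _ in range(length))
        reads = [injector(original, rate, rng) for _ in range(copies)]
        hits += positional_consensus(reads, length) == original
    return hits / trials

if __name__ == "__main__":
    print("substitutions:", recovery_rate(inject_substitutions, 0.05))
    print("deletions:    ", recovery_rate(inject_deletions, 0.05))
```

In this toy setting, unaligned voting tolerates substitutions because every copy stays in register with the original, whereas a single deletion shifts all downstream bases of a copy and corrupts the vote. That is the kind of situation in which an alignment step may pay for itself; the sketch does not reproduce the paper's experiments or error models.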

AVAILABILITY AND IMPLEMENTATION

The source code for FrameD, along with the data generated during its demonstration, is available in a public GitHub repository at https://github.com/dna-storage/framed (https://dx.doi.org/10.5281/zenodo.7757762).
