School of Computer Science, Key Laboratory of High Confidence Software Technologies, Peking University, Beijing, China.
Center for Quantitative Biology, Peking University, Beijing, China.
Nature. 2024 Oct;634(8035):824-832. doi: 10.1038/s41586-024-08040-5. Epub 2024 Oct 23.
DNA storage has shown potential to transcend current silicon-based data storage technologies in storage density, longevity and energy consumption. However, writing large-scale data directly into DNA sequences by de novo synthesis remains uneconomical in time and cost. We present an alternative, parallel strategy that enables the writing of arbitrary data on DNA using premade nucleic acids. Through self-assembly-guided enzymatic methylation, epigenetic modifications, serving as information bits, can be introduced precisely onto universal DNA templates to enact molecular movable-type printing. By programming with a finite set of 700 DNA movable types and five templates, we achieved the synthesis-free writing of approximately 275,000 bits on an automated platform, with 350 bits written per reaction. The data encoded in complex epigenetic patterns were retrieved at high throughput by nanopore sequencing, and algorithms were developed to finely resolve 240 modification patterns per sequencing reaction. With the epigenetic information bits framework, distributed and bespoke DNA storage was implemented by 60 volunteers lacking professional biolab experience. Our framework presents a new modality of DNA data storage that is parallel, programmable, stable and scalable. Such an unconventional modality opens up avenues towards practical data storage and dual-mode data functions in biomolecular systems.
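As a rough illustration of the figures quoted in the abstract, the sketch below splits a binary payload into 350-bit chunks, one per writing reaction, and estimates how many reactions a payload of about 275,000 bits would take. Only the 350 bits per reaction and the approximate total come from the abstract. The function names, the zero padding, and the convention that 1 means a methylated site and 0 an unmethylated one are assumptions made for illustration, not the authors' encoding scheme.

```python
import math

BITS_PER_REACTION = 350  # figure quoted in the abstract


def bytes_to_bits(data: bytes) -> list[int]:
    """Expand bytes into bits, most significant bit first.

    Assumed convention: 1 = methylated site, 0 = unmethylated site.
    """
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def reactions_needed(total_bits: int, bits_per_reaction: int = BITS_PER_REACTION) -> int:
    """Number of reactions required if each one writes a fixed number of bits."""
    return math.ceil(total_bits / bits_per_reaction)


def split_into_reactions(
    bits: list[int], bits_per_reaction: int = BITS_PER_REACTION
) -> list[list[int]]:
    """Zero-pad the bit stream (an assumption) and cut it into per-reaction chunks."""
    count = reactions_needed(len(bits), bits_per_reaction)
    padded = bits + [0] * (count * bits_per_reaction - len(bits))
    return [
        padded[i * bits_per_reaction:(i + 1) * bits_per_reaction]
        for i in range(count)
    ]


if __name__ == "__main__":
    chunks = split_into_reactions(bytes_to_bits(b"DNA"))
    print(len(chunks), len(chunks[0]))
    print(reactions_needed(275_000))
```

Under these assumptions, a payload of about 275,000 bits corresponds to roughly 786 reactions at 350 bits each. The abstract gives the total only approximately, so this count is an estimate and not a reported result.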