Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Nat Mater. 2021 Sep;20(9):1272-1280. doi: 10.1038/s41563-021-01021-3. Epub 2021 Jun 10.
DNA is an ultrahigh-density storage medium that could meet exponentially growing worldwide demand for archival data storage if DNA synthesis costs declined sufficiently and if random access of files within exabyte-to-yottabyte-scale DNA data pools were feasible. Here, we demonstrate a path to overcome the second barrier by encapsulating data-encoding DNA file sequences within impervious silica capsules that are surface labelled with single-stranded DNA barcodes. Barcodes are chosen to represent file metadata, enabling selection of sets of files with Boolean logic directly, without use of amplification. We demonstrate random access of image files from a prototypical 2-kilobyte image database using fluorescence sorting with selection sensitivity of one in 10^6 files, which enables one in 10^(6N) selection capability using N optical channels. Our strategy thereby offers a scalable concept for random access of archival files in large-scale molecular datasets.
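The Boolean selection idea can be illustrated with a minimal software sketch. It is only an analogy for the physical sorting step, not the method used in the work: the capsule names, barcode labels, and helper functions (`has`, `and_`, `or_`, `not_`, `select`, `selection_capability`) are all hypothetical. Each capsule is modelled as a set of barcode labels, and a query is a Boolean gate over those labels. The last function assumes that optical channels act independently, so that a per-channel sensitivity of one in s compounds to one in s^N over N channels.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical toy database: capsule name -> barcodes displayed on its surface.
CAPSULES: Dict[str, FrozenSet[str]] = {
    "cat_photo": frozenset({"cat", "domestic", "photo"}),
    "tiger_photo": frozenset({"cat", "wild", "photo"}),
    "dog_sketch": frozenset({"dog", "domestic", "sketch"}),
    "wolf_photo": frozenset({"dog", "wild", "photo"}),
}

# A gate decides whether a capsule's barcode set satisfies a query.
Gate = Callable[[FrozenSet[str]], bool]


def has(barcode: str) -> Gate:
    """Gate that passes capsules carrying the given barcode."""
    return lambda labels: barcode in labels


def and_(*gates: Gate) -> Gate:
    """Gate that passes only if every sub-gate passes."""
    return lambda labels: all(g(labels) for g in gates)


def or_(*gates: Gate) -> Gate:
    """Gate that passes if at least one sub-gate passes."""
    return lambda labels: any(g(labels) for g in gates)


def not_(gate: Gate) -> Gate:
    """Gate that passes when the sub-gate fails."""
    return lambda labels: not gate(labels)


def select(capsules: Dict[str, FrozenSet[str]], gate: Gate) -> List[str]:
    """Return the sorted names of capsules whose barcodes satisfy the gate."""
    return sorted(name for name, labels in capsules.items() if gate(labels))


def selection_capability(n_channels: int, per_channel: int) -> int:
    """Pool size from which one file can be picked, assuming independent
    channels that each resolve one file in `per_channel`."""
    return per_channel ** n_channels


if __name__ == "__main__":
    # "cat AND NOT wild" picks the single domestic cat image.
    print(select(CAPSULES, and_(has("cat"), not_(has("wild")))))
    # "wild OR sketch" picks three of the four capsules.
    print(select(CAPSULES, or_(has("wild"), has("sketch"))))
```

In this sketch a query is composed once and applied to every capsule, which loosely mirrors how a sorting gate is applied to the whole pool without amplifying any individual file.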