European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK.
Nature. 2013 Feb 7;494(7435):77-80. doi: 10.1038/nature11875. Epub 2013 Jan 23.
Digital production, transmission and storage have revolutionized how we access and use information but have also made archiving an increasingly complex task that requires active, continuing maintenance of digital media. This challenge has focused some interest on DNA as an attractive target for information storage because of its capacity for high-density information encoding, longevity under easily achieved conditions and proven track record as an information bearer. Previous DNA-based information storage approaches have encoded only trivial amounts of information or were not amenable to scaling-up, and used no robust error-correction and lacked examination of their cost-efficiency for large-scale information archival. Here we describe a scalable method that can reliably store more information than has been handled before. We encoded computer files totalling 739 kilobytes of hard-disk storage and with an estimated Shannon information of 5.2 × 10^6 bits into a DNA code, synthesized this DNA, sequenced it and reconstructed the original files with 100% accuracy. Theoretical analysis indicates that our DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving. In fact, current trends in technological advances are reducing DNA synthesis costs at a pace that should make our scheme cost-effective for sub-50-year archiving within a decade.
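The abstract does not spell out the DNA code itself. As a rough illustration of what encoding files "into a DNA code" can look like, the following is a minimal sketch of a rotating base-3 code in which each ternary digit selects one of the three nucleotides that differ from the previous one, so no nucleotide is ever repeated back to back. Everything here is an assumption made for illustration: the function names, the fixed six-digit ternary expansion per byte and the starting nucleotide are hypothetical, and the sketch omits the fragmenting, indexing and error-correction that a robust scheme would need.

```python
NUCLEOTIDES = "ACGT"
TRITS_PER_BYTE = 6  # assumption: fixed width, since 3**6 = 729 >= 256


def bytes_to_trits(data: bytes) -> list[int]:
    """Expand each byte into six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: list[int]) -> bytes:
    """Inverse of bytes_to_trits; rejects malformed input."""
    if len(trits) % TRITS_PER_BYTE:
        raise ValueError("trit count is not a multiple of 6")
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        if value > 255:
            raise ValueError("trit group does not encode a byte")
        out.append(value)
    return bytes(out)


def encode(data: bytes, start: str = "A") -> str:
    """Map bytes to DNA; each trit picks a nucleotide unlike the previous one."""
    prev = start  # hypothetical seed nucleotide, not part of the output
    dna = []
    for trit in bytes_to_trits(data):
        choices = [n for n in NUCLEOTIDES if n != prev]  # three options
        prev = choices[trit]
        dna.append(prev)
    return "".join(dna)


def decode(dna: str, start: str = "A") -> bytes:
    """Recover the bytes; raises ValueError on a repeat or unknown symbol."""
    prev = start
    trits = []
    for nucleotide in dna:
        choices = [n for n in NUCLEOTIDES if n != prev]
        trits.append(choices.index(nucleotide))
        prev = nucleotide
    return trits_to_bytes(trits)
```

Under these assumptions, `decode(encode(payload))` returns `payload` unchanged, and the encoded string contains six nucleotides per input byte with no two adjacent nucleotides equal.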