Composite Mapping for Peptide-Based Data Storage with Higher Coding Density and Fewer Synthesis Cycles.

Authors

Zhang Anxun, Wang Longjie, Zhai Xiaowei, Xiao Yao, Wu Yanchan, Zhao Yongxi, Liu Kai, Zheng Ji-Shen, Chen Dong

Affiliations

Department of Medical Oncology, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, 310003, P. R. China.

College of Energy Engineering and State Key Laboratory of Clean Energy Utilization, Zhejiang University, Hangzhou, Zhejiang, 310003, P. R. China.

Publication information

Adv Sci (Weinh). 2025 Jul;12(27):e2503790. doi: 10.1002/advs.202503790. Epub 2025 Apr 26.

DOI: 10.1002/advs.202503790

PMID: 40285644

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12279161/

Abstract

Peptides are natural information-bearing media and are promising for high-density data storage. However, conventional mapping assigns one amino acid (AA) to one binary code, so coding density can be raised only by increasing the total number of different AAs, which limits the achievable improvement. Here, a novel composite mapping strategy is developed in which each position in the peptide sequence is a composite letter consisting of several different AAs. Thousands of composite letters are thus available for mapping, breaking the limit of conventional mapping. When 20 different AAs are used, six-AA composite mapping achieves a coding density of 15 bits/letter, whereas conventional mapping reaches only 4 bits/AA. The whole process is successfully demonstrated for the first time: encoding data into composite letter sequences, synthesizing the sequences via solid-phase peptide synthesis, sequencing them by mass spectrometry, and decoding the data from them. Composite mapping also offers several distinct advantages, including high coding density, few synthesis cycles, high reliability against errors, a low probability of homopolymers, and good compatibility with other encoding algorithms. The composite mapping strategy provides a novel way for peptide-based data storage to increase coding density and reduce synthesis cycles, showing great potential for large-scale data storage.
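The densities quoted in the abstract can be reproduced with a short calculation. The sketch below is not the authors' encoder. It assumes that a composite letter is an unordered set of six distinct AAs drawn from 20, that the usable bits per position are the floor of log2 of the number of available letters, and that each position costs one synthesis cycle. The 20 one-letter codes, the lexicographic lookup table, and the names `encode`, `decode`, and `cycles_needed` are hypothetical choices made for illustration only.

```python
import math
from itertools import combinations, islice

# Assumption: the 20 standard one-letter AA codes, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = 6  # AAs per composite letter (the six-AA case from the abstract)


def floor_log2(n: int) -> int:
    """Whole bits representable by n distinct symbols (n >= 1)."""
    return n.bit_length() - 1


# Number of unordered six-AA subsets of 20 AAs: C(20, 6).
N_LETTERS = math.comb(len(AMINO_ACIDS), K)
BITS = floor_log2(N_LETTERS)

# Illustrative mapping only: the first 2**BITS subsets in lexicographic order.
LETTERS = list(islice(combinations(AMINO_ACIDS, K), 2 ** BITS))
INDEX = {letter: i for i, letter in enumerate(LETTERS)}


def encode(bits: str) -> list[tuple[str, ...]]:
    """Map a bit string (length a multiple of BITS) to composite letters."""
    if len(bits) % BITS:
        raise ValueError(f"bit string length must be a multiple of {BITS}")
    return [LETTERS[int(bits[i:i + BITS], 2)] for i in range(0, len(bits), BITS)]


def decode(letters: list[tuple[str, ...]]) -> str:
    """Recover the bit string; AA order within a letter does not matter."""
    return "".join(format(INDEX[tuple(sorted(l))], f"0{BITS}b") for l in letters)


def cycles_needed(n_bits: int, bits_per_position: int) -> int:
    """Positions (assumed: one synthesis cycle each) needed for n_bits."""
    return -(-n_bits // bits_per_position)  # ceiling division


if __name__ == "__main__":
    print("composite letters:", N_LETTERS)
    print("bits per composite letter:", BITS)
    print("bits per AA, conventional:", floor_log2(len(AMINO_ACIDS)))
    payload = "101100111000101" * 2  # 30 bits -> 2 composite letters
    letters = encode(payload)
    print(letters, decode(letters) == payload)
    print("cycles for 120 bits:", cycles_needed(120, 4), "vs", cycles_needed(120, BITS))
```

Under these assumptions there are C(20, 6) = 38,760 composite letters, enough for 15 whole bits per position, while 20 single AAs give only 4 whole bits. The same payload therefore needs roughly a quarter as many positions, which is consistent with the "fewer synthesis cycles" claim.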

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ed0c/12279161/5158473d5926/ADVS-12-2503790-g002.jpg
