DUHI: Dynamically updated hash index clustering method for DNA storage.

Author information

Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, 116622, Dalian, China.

School of Computer Science and Technology, Dalian University of Technology, 116024, Dalian, China.

Publication information

Comput Biol Med. 2023 Sep;164:107244. doi: 10.1016/j.compbiomed.2023.107244. Epub 2023 Jul 11.

DOI: 10.1016/j.compbiomed.2023.107244
PMID: 37453377

Abstract

The exponential growth of global data leads to the problem of insufficient data storage capacity. DNA storage can be an ideal storage method due to its high storage density and long storage time. However, the DNA storage process is subject to unavoidable errors that can lead to increased cluster redundancy during data reading, which in turn affects the accuracy of the data reads. This paper proposes a dynamically updated hash index (DUHI) clustering method for DNA storage, which clusters sequences by constructing a dynamic core index set and using hash lookup. The proposed clustering method is analyzed in terms of overall reliability evaluation and visualization evaluation. The results show that the DUHI clustering method can reduce the redundancy of more than 10% of the sequences within the cluster and increase the reconstruction rate of the sequences to more than 99%. Therefore, our method solves the high redundancy problem after DNA sequence clustering, improves the accuracy of data reading, and promotes the development of DNA storage.
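The abstract names two ingredients, an index set that is updated as clustering proceeds and hash lookup, without giving algorithmic detail. The following is a minimal, generic sketch of that idea and not the DUHI algorithm itself: the k-mer index, the voting step, the edit-distance check, and every name and parameter (`cluster_reads`, `k`, `max_dist`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def edit_distance(a: str, b: str) -> int:
    """Plain dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def kmers(seq: str, k: int) -> set:
    """All distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_reads(reads, k=4, max_dist=3):
    """Greedy clustering with a hash index that grows as reads are assigned.

    Hypothetical illustration only; thresholds are arbitrary.
    """
    index = defaultdict(set)  # k-mer -> ids of clusters whose reads contain it
    cores = []                # cores[c] = first read of cluster c (its representative)
    clusters = []             # clusters[c] = reads assigned to cluster c
    for read in reads:
        read_kmers = kmers(read, k)
        # Hash lookup: count how many k-mers each existing cluster shares with the read.
        votes = Counter(c for km in read_kmers for c in index.get(km, ()))
        assigned = None
        for c, _ in votes.most_common(3):
            # Confirm the candidate with an exact distance check against its core.
            if edit_distance(read, cores[c]) <= max_dist:
                assigned = c
                break
        if assigned is None:
            # No close cluster found: the read starts a new one.
            assigned = len(cores)
            cores.append(read)
            clusters.append([])
        clusters[assigned].append(read)
        # Dynamic update: the read's k-mers now also point to its cluster.
        for km in read_kmers:
            index[km].add(assigned)
    return clusters


# Two made-up reference strands plus noisy copies (substitution, deletion, insertion).
reads = [
    "ACGTACGTTAGCCGATAGCT",
    "ACGTACGTTAGCCGATAGGT",
    "TTGACCGGATATCCGGAATC",
    "TTGACCGGATTCCGGAATC",
    "ACGTACGGTTAGCCGATAGCT",
]
clusters = cluster_reads(reads)
```

In this sketch the index lookup narrows the search to a few candidate clusters, so the quadratic edit-distance computation runs only against those candidates rather than against every cluster.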

Similar articles

1. DUHI: Dynamically updated hash index clustering method for DNA storage.
   Comput Biol Med. 2023 Sep;164:107244. doi: 10.1016/j.compbiomed.2023.107244. Epub 2023 Jul 11.
2. GradHC: highly reliable gradual hash-based clustering for DNA storage systems.
   Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae274.
3. Clover: tree structure-based efficient DNA clustering for DNA-based data storage.
   Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac336.
4. Parallel hash-based EST clustering algorithm for gene sequencing.
   DNA Cell Biol. 2004 Oct;23(10):615-23. doi: 10.1089/dna.2004.23.615.
5. Efficient filtering methods for clustering cDNAs with spliced sequence alignment.
   Bioinformatics. 2004 Jan 1;20(1):29-39. doi: 10.1093/bioinformatics/btg367.
6. Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
   Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
7. Query-Adaptive Reciprocal Hash Tables for Nearest Neighbor Search.
   IEEE Trans Image Process. 2016 Feb;25(2):907-19. doi: 10.1109/TIP.2015.2505180. Epub 2015 Dec 3.
8. Perfect Hamming code with a hash table for faster genome mapping.
   BMC Genomics. 2011 Nov 30;12 Suppl 3(Suppl 3):S8. doi: 10.1186/1471-2164-12-S3-S8.
9. SEED: efficient clustering of next-generation sequences.
   Bioinformatics. 2011 Sep 15;27(18):2502-9. doi: 10.1093/bioinformatics/btr447. Epub 2011 Aug 2.
10. Design and Application of Deep Hash Embedding Algorithm with Fusion Entity Attribute Information.
    Entropy (Basel). 2023 Feb 15;25(2):361. doi: 10.3390/e25020361.

Cited by

1. Levy Sooty Tern Optimization Algorithm Builds DNA Storage Coding Sets for Random Access.
   Entropy (Basel). 2024 Sep 11;26(9):778. doi: 10.3390/e26090778.
2. On secondary structure avoidance of codes for DNA storage.
   Comput Struct Biotechnol J. 2023 Nov 29;23:140-147. doi: 10.1016/j.csbj.2023.11.035. eCollection 2024 Dec.