Study of the error correction capability of multiple sequence alignment algorithm (MAFFT) in DNA storage.

Affiliation

Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China.

Publication information

BMC Bioinformatics. 2023 Mar 23;24(1):111. doi: 10.1186/s12859-023-05237-9.

Abstract

Synchronization (insertion-deletion) errors remain a major challenge for reliable information retrieval in DNA storage. Unlike traditional error correction codes (ECC), which add redundancy to the stored information, multiple sequence alignment (MSA) addresses this problem by searching for conserved subsequences. In this paper, we conduct a comprehensive simulation study of the error correction capability of a typical MSA algorithm, MAFFT. Our results reveal that its capability exhibits a phase transition at an error rate of around 20%. Below this critical value, increasing the sequencing depth eventually allows it to approach complete recovery. Above it, performance plateaus at a poor level. Given a reasonable sequencing depth (≤ 70), MSA can achieve complete recovery in the low error regime and effectively correct 90% of the errors in the medium error regime. In addition, MSA is robust to imperfect clustering. It can also be combined with other means such as ECC, repeated markers, or any other code constraints. Furthermore, by selecting an appropriate sequencing depth, this strategy can achieve an optimal trade-off between cost and reading speed. MSA could be a competitive alternative for future DNA storage.
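To illustrate how an alignment can be turned into a recovered sequence, here is a minimal sketch of a column-wise majority vote over reads that have already been aligned (for example, by MAFFT). The abstract does not describe the exact consensus rule used in the study, so the function `consensus_from_alignment`, the gap symbol `-`, and the rule of dropping gap-majority columns are all assumptions made for illustration only.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol, as used in typical aligned FASTA output


def consensus_from_alignment(aligned_reads):
    """Return a majority-vote consensus from equal-length aligned reads.

    Hypothetical helper: each column keeps its most common symbol, and
    columns where the gap symbol wins are dropped (treated as insertions
    present in only a minority of reads). Ties are resolved in favour of
    the symbol that appears first in the column.
    """
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != GAP:
            consensus.append(symbol)
    return "".join(consensus)


# Toy alignment of five noisy reads of the reference "ACGTAC".
aligned = [
    "A-CGTAC",  # error-free read
    "ATCGTAC",  # insertion of T after the first base
    "A-C-TAC",  # deletion of G
    "A-CGTTC",  # substitution A -> T
    "A-CGTAC",  # error-free read
]
recovered = consensus_from_alignment(aligned)
```

In this toy case each error appears in only one read per column, so the vote recovers the reference. With more reads per cluster (a higher sequencing depth), an error is more likely to be outvoted, which is consistent with the depth dependence the abstract reports below the critical error rate.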

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0faa/10037887/25beb7901b8d/12859_2023_5237_Fig1_HTML.jpg
