
Relative Suffix Trees.

Authors

Farruggia Andrea, Gagie Travis, Navarro Gonzalo, Puglisi Simon J, Sirén Jouni

Affiliations

Department of Computer Science, University of Pisa, Largo Bruno Pontecorvo 3, 56127 Pisa PI, Italy.

CeBiB-Center for Biotechnology and Bioengineering, Santiago, Chile.

Publication information

Comput J. 2018 May;61(5):773-788. doi: 10.1093/comjnl/bxx108. Epub 2017 Nov 21.

DOI: 10.1093/comjnl/bxx108
PMID: 29795706
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5956352/
Abstract

Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into reducing the space usage, leading ultimately to compressed suffix trees. These compressed data structures can efficiently simulate the suffix tree, while using space proportional to a compressed representation of the sequence. In this work, we take a new approach to compressed suffix trees for repetitive sequence collections, such as collections of individual genomes. We compress the suffix trees of individual sequences relative to the suffix tree of a reference sequence. These relative data structures provide competitive time/space trade-offs, being almost as small as the smallest compressed suffix trees for repetitive collections, and competitive in time with the largest and fastest compressed suffix trees.
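The abstract does not spell out how the relative structures are built, so the following is only a generic illustration of what compressing one sequence "relative to a reference" can mean. It is a minimal sketch of a greedy relative Lempel-Ziv style parse, swapped in here for illustration and not a reproduction of the paper's relative suffix tree. The function names `rlz_parse` and `rlz_decode` and the toy sequences are hypothetical, and the naive quadratic search stands in for the indexed search a real implementation would presumably use.

```python
from typing import List, Tuple

Phrase = Tuple[int, int, str]  # (position in reference, match length, next character)


def rlz_parse(reference: str, target: str) -> List[Phrase]:
    """Greedily split `target` into phrases copied from `reference`.

    Each phrase is the longest reference substring matching the target at
    the current position, followed by one explicit character.
    Illustrative only: the longest-match search is a naive scan.
    """
    phrases: List[Phrase] = []
    i, n = 0, len(target)
    while i < n:
        best_pos, best_len = 0, 0
        for start in range(len(reference)):
            length = 0
            # Stop one character early so an explicit next character always exists.
            while (start + length < len(reference)
                   and i + length < n - 1
                   and reference[start + length] == target[i + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = start, length
        phrases.append((best_pos, best_len, target[i + best_len]))
        i += best_len + 1
    return phrases


def rlz_decode(reference: str, phrases: List[Phrase]) -> str:
    """Rebuild the target from the reference and the phrase list."""
    parts = []
    for pos, length, ch in phrases:
        parts.append(reference[pos:pos + length])
        parts.append(ch)
    return "".join(parts)


# Toy example: the target differs from the reference by one substitution.
reference = "ACGTACGTTGCA"
target = "ACGTACGATGCA"
phrases = rlz_parse(reference, target)
restored = rlz_decode(reference, phrases)
```

When the target is nearly identical to the reference, as with individual genomes of one species, the phrase list stays short, which is the intuition behind storing per-sequence structures relative to a reference rather than independently.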



