文献检索，用中文搜 PubMed

Suppr 超能文献

核心技术专利：CN118964589B侵权必究

Suppr 超能文献

核心技术专利：CN118964589B侵权必究

Lipták Zsuzsanna, Masillo Francesco, Puglisi Simon J

Department of Computer Science, University of Verona, Verona, Italy.

Helsinki Institute for Information Technology (HIIT), Helsinki, Finland.

Algorithms Mol Biol. 2024 Mar 12;19(1):11. doi: 10.1186/s13015-023-00245-z.

We introduce a new algorithm for constructing the generalized suffix array of a collection of highly similar strings. As a first step, we construct a compressed representation of the matching statistics of the collection with respect to a reference string. We then use this data structure to distribute suffixes into a partial order, and subsequently to speed up suffix comparisons to complete the generalized suffix array. Our experimental evidence with a prototype implementation (a tool we call sacamats) shows that on string collections with highly similar strings we can construct the suffix array in time competitive with or faster than the fastest available methods. Along the way, we describe a heuristic for fast computation of the matching statistics of two strings, which may be of independent interest.

我们介绍了一种用于构建高度相似字符串集合的广义后缀数组的新算法。第一步，我们构建该集合相对于一个参考字符串的匹配统计信息的压缩表示。然后，我们使用此数据结构将后缀分配到一个偏序中，并随后加速后缀比较以完成广义后缀数组的构建。我们使用原型实现（我们称之为sacamats的工具）的实验证据表明，对于具有高度相似字符串的字符串集合，我们能够以与最快的现有方法相当或更快的时间构建后缀数组。在此过程中，我们描述了一种用于快速计算两个字符串匹配统计信息的启发式方法，该方法可能具有独立的研究价值。

相似文献

Suffix sorting via matching statistics.通过匹配统计进行后缀排序。

Algorithms Mol Biol. 2024 Mar 12;19(1):11. doi: 10.1186/s13015-023-00245-z.

gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections.gsufsort：为字符串集合构建后缀数组、最长公共前缀数组和Burrows-Wheeler变换

Algorithms Mol Biol. 2020 Sep 22;15:18. doi: 10.1186/s13015-020-00177-y. eCollection 2020.

Generalized enhanced suffix array construction in external memory.外部内存中的广义增强后缀数组构造

Algorithms Mol Biol. 2017 Dec 7;12:26. doi: 10.1186/s13015-017-0117-9. eCollection 2017.

Breaking the -Barrier in the Construction of Compressed Suffix Arrays and Suffix Trees.突破压缩后缀数组和后缀树构建中的-障碍

Proc Annu ACM SIAM Symp Discret Algorithms. 2023;2023:5122-5202. doi: 10.1137/1.9781611977554.ch187.

Using the Sadakane compressed suffix tree to solve the all-pairs suffix-prefix problem.使用笹钟根压缩后缀树解决所有后缀对前缀问题。

Biomed Res Int. 2014;2014:745298. doi: 10.1155/2014/745298. Epub 2014 Apr 16.

Computing matching statistics on Wheeler DFAs.计算惠勒确定有限自动机上的匹配统计信息。

Proc Data Compress Conf. 2023 Mar;2023:150-159. doi: 10.1109/dcc55655.2023.00023. Epub 2023 May 19.

An Elegant Algorithm for the Construction of Suffix Arrays.一种构建后缀数组的优雅算法。

J Discrete Algorithms (Amst). 2014 Jul 1;27:21-28. doi: 10.1016/j.jda.2014.03.001.

Relative Suffix Trees.

Comput J. 2018 May;61(5):773-788. doi: 10.1093/comjnl/bxx108. Epub 2017 Nov 21.

Indexing huge genome sequences for solving various problems.为解决各种问题对庞大的基因组序列进行索引。

Genome Inform. 2001;12:175-83.

Computing the original eBWT faster, simpler, and with less memory.更快、更简单且占用更少内存地计算原始增强型Burrows-Wheeler变换。

Int Symp String Process Inf Retr. 2021 Oct;12944:129-142. doi: 10.1007/978-3-030-86692-1_11. Epub 2021 Sep 27.

引用本文的文献

Marker discovery in the large.在大型研究中发现标志物。

Bioinform Adv. 2024 Jul 27;4(1):vbae113. doi: 10.1093/bioadv/vbae113. eCollection 2024.

本文引用的文献

A draft human pangenome reference.人类泛基因组参考草图。

Nature. 2023 May;617(7960):312-324. doi: 10.1038/s41586-023-05896-x. Epub 2023 May 10.

Finding Maximal Exact Matches Using the r-Index.使用 r-索引查找最大精确匹配。

J Comput Biol. 2022 Feb;29(2):188-194. doi: 10.1089/cmb.2021.0445. Epub 2022 Jan 17.

Efficient Construction of a Complete Index for Pan-Genomics Read Alignment.高效构建全基因组读段比对的完整索引。

Lipták Zsuzsanna, Masillo Francesco, Puglisi Simon J

Department of Computer Science, University of Verona, Verona, Italy.

Helsinki Institute for Information Technology (HIIT), Helsinki, Finland.

Algorithms Mol Biol. 2024 Mar 12;19(1):11. doi: 10.1186/s13015-023-00245-z.

相似文献

Suffix sorting via matching statistics.通过匹配统计进行后缀排序。

Algorithms Mol Biol. 2024 Mar 12;19(1):11. doi: 10.1186/s13015-023-00245-z.

gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections.gsufsort：为字符串集合构建后缀数组、最长公共前缀数组和Burrows-Wheeler变换

Algorithms Mol Biol. 2020 Sep 22;15:18. doi: 10.1186/s13015-020-00177-y. eCollection 2020.

Generalized enhanced suffix array construction in external memory.外部内存中的广义增强后缀数组构造

Algorithms Mol Biol. 2017 Dec 7;12:26. doi: 10.1186/s13015-017-0117-9. eCollection 2017.

Breaking the -Barrier in the Construction of Compressed Suffix Arrays and Suffix Trees.突破压缩后缀数组和后缀树构建中的-障碍

Proc Annu ACM SIAM Symp Discret Algorithms. 2023;2023:5122-5202. doi: 10.1137/1.9781611977554.ch187.

Using the Sadakane compressed suffix tree to solve the all-pairs suffix-prefix problem.使用笹钟根压缩后缀树解决所有后缀对前缀问题。

Biomed Res Int. 2014;2014:745298. doi: 10.1155/2014/745298. Epub 2014 Apr 16.

Computing matching statistics on Wheeler DFAs.计算惠勒确定有限自动机上的匹配统计信息。

Proc Data Compress Conf. 2023 Mar;2023:150-159. doi: 10.1109/dcc55655.2023.00023. Epub 2023 May 19.

An Elegant Algorithm for the Construction of Suffix Arrays.一种构建后缀数组的优雅算法。

J Discrete Algorithms (Amst). 2014 Jul 1;27:21-28. doi: 10.1016/j.jda.2014.03.001.

Relative Suffix Trees.

Comput J. 2018 May;61(5):773-788. doi: 10.1093/comjnl/bxx108. Epub 2017 Nov 21.

Indexing huge genome sequences for solving various problems.为解决各种问题对庞大的基因组序列进行索引。

Genome Inform. 2001;12:175-83.

Computing the original eBWT faster, simpler, and with less memory.更快、更简单且占用更少内存地计算原始增强型Burrows-Wheeler变换。

Int Symp String Process Inf Retr. 2021 Oct;12944:129-142. doi: 10.1007/978-3-030-86692-1_11. Epub 2021 Sep 27.

引用本文的文献

Marker discovery in the large.在大型研究中发现标志物。

Bioinform Adv. 2024 Jul 27;4(1):vbae113. doi: 10.1093/bioadv/vbae113. eCollection 2024.

本文引用的文献

A draft human pangenome reference.人类泛基因组参考草图。

Nature. 2023 May;617(7960):312-324. doi: 10.1038/s41586-023-05896-x. Epub 2023 May 10.

Finding Maximal Exact Matches Using the r-Index.使用 r-索引查找最大精确匹配。

J Comput Biol. 2022 Feb;29(2):188-194. doi: 10.1089/cmb.2021.0445. Epub 2022 Jan 17.

Efficient Construction of a Complete Index for Pan-Genomics Read Alignment.高效构建全基因组读段比对的完整索引。

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

通过匹配统计进行后缀排序。

Suffix sorting via matching statistics.

作者信息

机构信息

出版信息

相似文献

引用本文的文献

本文引用的文献

通过匹配统计进行后缀排序。

Suffix sorting via matching statistics.

作者信息

机构信息

出版信息

相似文献

引用本文的文献

本文引用的文献