Martínez-Guardiola César, Brown Nathaniel K, Silva-Coira Fernando, Köppl Dominik, Gagie Travis, Ladra Susana
Universidade da Coruña, CITIC, A Coruña, Spain.
Dalhousie University, Halifax, Canada.
Proc Data Compress Conf. 2023 Mar;2023:268-277. doi: 10.1109/dcc55655.2023.00035. Epub 2023 May 19.
MONI (Rossi et al., 2022) can store a pangenomic dataset in small space and later, given a pattern, quickly find the maximal exact matches (MEMs) of that pattern with respect to the dataset. In this paper we consider its one-pass version (Boucher et al., 2021), whose query times are dominated in our experiments by longest common extension (LCE) queries. We show how a small modification lets us avoid most of these queries, which significantly speeds up MONI in practice while only slightly increasing its size.
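To make the two notions in the abstract concrete, here is a minimal brute-force sketch of an LCE query and of MEM finding. It is not MONI's method: MONI works on a compressed index, whereas this version compares characters directly and takes quadratic time. The function names (`lce`, `matching_statistics`, `mems`) and the example strings are hypothetical, chosen only for illustration.

```python
def lce(a, i, b, j):
    """Longest common extension: length of the longest common prefix
    of a[i:] and b[j:], found by direct character comparison."""
    k = 0
    while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
        k += 1
    return k


def matching_statistics(pattern, text):
    """For each position i in pattern, the length of the longest prefix
    of pattern[i:] that occurs somewhere in text (brute force)."""
    return [
        max((lce(pattern, i, text, j) for j in range(len(text))), default=0)
        for i in range(len(pattern))
    ]


def mems(pattern, text):
    """Return MEMs as (start, length) pairs in pattern.

    The match starting at i is right-maximal by construction. It is
    left-maximal when it cannot be extended one character to the left,
    i.e. when i == 0 or the match starting at i - 1 is not longer
    than the match starting at i.
    """
    ms = matching_statistics(pattern, text)
    result = []
    for i, length in enumerate(ms):
        if length > 0 and (i == 0 or ms[i - 1] <= length):
            result.append((i, length))
    return result


if __name__ == "__main__":
    text = "GATTACA"      # stand-in for the indexed dataset
    pattern = "TTAGAT"    # stand-in for the query pattern
    for start, length in mems(pattern, text):
        print(start, length, pattern[start:start + length])
```

Each call to `matching_statistics` here issues one LCE query per pair of positions, which is the kind of cost an index avoids; the sketch only pins down what the queries compute.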