Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), 20139, Milan, Italy.
IEO, European Institute of Oncology IRCCS, 20141, Milan, Italy.
BMC Bioinformatics. 2020 Oct 19;21(1):464. doi: 10.1186/s12859-020-03781-2.
Genome browsers are widely used for locating interesting genomic regions, but their interactive use is limited to inspecting short genomic portions. Ideally, a user would specify a pattern of regions in the browser and then retrieve, from across the whole genome, the other regions where that pattern occurs, ranked by similarity.
We developed SimSearch, an optimized pattern-search method and an open source plugin for the Integrated Genome Browser (IGB), to find genomic region sets that are similar to a given region pattern. It computes genome-wide pattern searches efficiently on large datasets and presents the results visually; the plugin lets users select a pattern of interest on IGB tracks and visualize the computed occurrences of similar patterns along the entire genome. SimSearch also includes functions for the annotation and enrichment of results, and comes with a Quickload repository of numerous epigenomic feature datasets from ENCODE and Roadmap Epigenomics. The paper also presents use cases showing several genome-wide analyses of biological interest that can be easily performed with this approach.
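The kind of query described above can be illustrated with a toy example. The sketch below is not the SimSearch algorithm, whose details are not given in this abstract; it is a naive sliding-window comparison under simplifying assumptions (a single chromosome, one track, and a pattern described only by region lengths and the gaps between neighbouring regions). All names (`Region`, `shape`, `rank_matches`) and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (start, end), half-open, on one chromosome (assumption)


def shape(regions: List[Region]) -> List[int]:
    """Describe a run of regions by its lengths and the gaps between neighbours."""
    lengths = [end - start for start, end in regions]
    gaps = [regions[i + 1][0] - regions[i][1] for i in range(len(regions) - 1)]
    return lengths + gaps


def rank_matches(pattern: List[Region], track: List[Region], top: int = 3):
    """Return the `top` windows of `track` whose shape is closest to `pattern`.

    Each result is (distance, window); a smaller distance means more similar.
    """
    k = len(pattern)
    target = shape(pattern)
    ordered = sorted(track)
    scored = []
    # Slide a window of k consecutive regions along the track.
    for i in range(len(ordered) - k + 1):
        window = ordered[i:i + k]
        # Toy dissimilarity: sum of absolute differences in lengths and gaps.
        distance = sum(abs(a - b) for a, b in zip(shape(window), target))
        scored.append((distance, window))
    scored.sort(key=lambda item: item[0])
    return scored[:top]


# Made-up pattern: a 100 bp region followed, 50 bp later, by a 50 bp region.
pattern = [(100, 200), (250, 300)]
# Made-up track of regions on the same chromosome.
track = [
    (1000, 1100), (1150, 1200),
    (5000, 5400), (5900, 5950),
    (9000, 9090), (9150, 9200),
]
matches = rank_matches(pattern, track)
for distance, window in matches:
    print(distance, window)
```

The exhaustive scan is only meant to make the notion of "occurrences ranked by similarity" concrete; a real genome-wide search over many tracks would need a more efficient strategy and a richer similarity measure.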
The novel SimSearch method supports effective genome-wide pattern search and visualization; its relevance and practical usefulness are demonstrated through a number of use cases of biological interest. The SimSearch IGB plugin, documentation, and code are freely available at https://deib-geco.github.io/simsearch-app/ and https://github.com/DEIB-GECO/simsearch-app/.