Center for Bioinformatics and Computational Biology.
Institute for Advanced Computer Studies.
Bioinformatics. 2020 Sep 15;36(18):4682-4690. doi: 10.1093/bioinformatics/btaa591.
Genomic data repositories such as The Cancer Genome Atlas, the Encyclopedia of DNA Elements, and Bioconductor's AnnotationHub and ExperimentHub provide public access to large amounts of genomic data as flat files. Researchers often download a subset of data files from these repositories to perform exploratory data analysis. We developed Epiviz File Server, a Python library that implements an in situ data query system for locally or remotely hosted indexed genomic files, supporting not only visualization but also data transformation. The File Server library decouples data retrieval and transformation from specific visualization and analysis tools, and provides an abstract interface for defining computations independently of the location, format or structure of the file. We demonstrate the File Server in two use cases: (i) integration with Galaxy workflows and (ii) using Epiviz to create a custom genome browser from the Epigenome Roadmap dataset.
Epiviz File Server is open source and is available on GitHub at http://github.com/epiviz/epivizFileServer. The documentation for the File Server library is available at http://epivizfileserver.rtfd.io.
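The idea of defining computations independently of where or how the data is stored can be illustrated with a minimal Python sketch. This is not the Epiviz File Server API: the names `Measurement`, `InMemorySignal` and `ComputedMeasurement` are hypothetical, and an in-memory list stands in for an indexed genomic file so that the example runs without any data files.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean
from typing import Callable, Protocol, Sequence


class Measurement(Protocol):
    """Hypothetical interface: one value per fixed-width bin of a region."""

    def query(self, chrom: str, start: int, end: int, bin_size: int) -> list[float]:
        ...


@dataclass
class InMemorySignal:
    """Stand-in for a file-backed source (assumption: per-base values in memory)."""

    signal: dict[str, list[float]]  # chromosome name -> per-base values

    def query(self, chrom: str, start: int, end: int, bin_size: int) -> list[float]:
        values = self.signal[chrom][start:end]
        # Average the per-base values inside each bin of the requested region.
        return [mean(values[i:i + bin_size]) for i in range(0, len(values), bin_size)]


@dataclass
class ComputedMeasurement:
    """Applies a function bin by bin across sources, whatever their backing store."""

    sources: Sequence[Measurement]
    func: Callable[[Sequence[float]], float]

    def query(self, chrom: str, start: int, end: int, bin_size: int) -> list[float]:
        columns = [s.query(chrom, start, end, bin_size) for s in self.sources]
        # zip(*columns) yields one tuple per bin, holding each source's value.
        return [self.func(row) for row in zip(*columns)]


sample_a = InMemorySignal({"chr1": [1.0, 1.0, 3.0, 3.0]})
sample_b = InMemorySignal({"chr1": [3.0, 3.0, 5.0, 5.0]})
average = ComputedMeasurement([sample_a, sample_b], mean)
result = average.query("chr1", 0, 4, bin_size=2)
print(result)
```

Because `ComputedMeasurement` depends only on the `query` method, a source backed by a local or remote indexed file could in principle replace `InMemorySignal` without changing the computation.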