Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA.
Bioinformatics. 2018 Jan 1;34(1):114-116. doi: 10.1093/bioinformatics/btx547.
As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain.
Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries.
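The two kinds of operation described above, constraining which junctions to consider and scoring samples by the relative frequency of splicing patterns, can be illustrated with a small in-memory sketch. This is not Snaptron's implementation or API. The `Junction` record, its field names, and the `query` and `score_samples` helpers are hypothetical, a linear scan stands in for the indexed lookups, and the normalized-difference score is only one plausible way to compare two groups of junctions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Junction:
    """Toy junction record; field names are hypothetical, not Snaptron's schema."""
    chrom: str
    start: int
    end: int
    counts: Dict[str, int] = field(default_factory=dict)  # sample id -> read count


def query(junctions: List[Junction], chrom: str, start: int, end: int,
          min_samples: int = 1) -> List[Junction]:
    """Return junctions contained in [start, end] that occur in >= min_samples samples.

    A linear scan is used here purely for illustration; a real engine would
    answer the interval and count constraints from indexes.
    """
    return [
        j for j in junctions
        if j.chrom == chrom and j.start >= start and j.end <= end
        and sum(1 for c in j.counts.values() if c > 0) >= min_samples
    ]


def score_samples(group_a: List[Junction], group_b: List[Junction]) -> Dict[str, float]:
    """Score each sample by the normalized coverage difference between two groups.

    Positive scores mean the junctions in group B are relatively more frequent
    in that sample; the +1 in the denominator avoids division by zero.
    """
    samples = {s for j in group_a + group_b for s in j.counts}
    scores = {}
    for s in samples:
        a = sum(j.counts.get(s, 0) for j in group_a)
        b = sum(j.counts.get(s, 0) for j in group_b)
        scores[s] = (b - a) / (a + b + 1)
    return scores


# Made-up example data: two competing junctions on chr1 and one on chr2.
junctions = [
    Junction("chr1", 100, 200, {"s1": 9, "s2": 1}),
    Junction("chr1", 100, 300, {"s1": 0, "s2": 8}),
    Junction("chr2", 50, 80, {"s1": 4}),
]
hits = query(junctions, "chr1", 1, 1000, min_samples=1)
scores = score_samples([junctions[0]], [junctions[1]])
```

In this toy data, sample `s1` supports only the first junction and receives a negative score, while `s2` mostly supports the second and receives a positive one.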
Documentation is available at http://snaptron.cs.jhu.edu. Source code is available at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments under a CC BY-NC 4.0 license.
chris.wilks@jhu.edu or langmea@cs.jhu.edu.
Supplementary data are available at Bioinformatics online.