Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples.

Affiliations

Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.

Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA.

Publication information

Bioinformatics. 2018 Jan 1;34(1):114-116. doi: 10.1093/bioinformatics/btx547.

DOI:10.1093/bioinformatics/btx547

PMID:28968689

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5870547/

Abstract

MOTIVATION

As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain.

RESULTS

Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries.
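The kind of query described above (restrict to junctions in a genomic region, then restrict to samples matching a metadata term) can be illustrated with a toy sketch. This is not Snaptron's implementation or API: the data, the names `JUNCTIONS`, `SAMPLE_TISSUE` and `query`, and the use of a sorted list with binary search as a stand-in for the tree indexes are all assumptions made for illustration.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical toy data: (chromosome, start, end) -> {sample_id: read coverage}.
JUNCTIONS = {
    ("chr1", 100, 200): {"s1": 10, "s2": 3},
    ("chr1", 150, 400): {"s2": 7, "s3": 1},
    ("chr1", 500, 900): {"s1": 2, "s3": 8},
}
# Hypothetical sample metadata: sample_id -> tissue label.
SAMPLE_TISSUE = {"s1": "brain", "s2": "liver", "s3": "brain"}

# Sorted (start, end) pairs per chromosome; a simple stand-in for a tree index.
coords_by_chrom = defaultdict(list)
for chrom, start, end in sorted(JUNCTIONS):
    coords_by_chrom[chrom].append((start, end))

# Inverted index: metadata term -> set of sample ids carrying that term.
samples_by_term = defaultdict(set)
for sample, tissue in SAMPLE_TISSUE.items():
    samples_by_term[tissue].add(sample)


def query(chrom, lo, hi, term):
    """Return {(start, end): summed coverage} for junctions lying entirely
    within [lo, hi], counting only samples whose metadata matches `term`."""
    index = coords_by_chrom.get(chrom, [])
    wanted = samples_by_term.get(term, set())
    hits = {}
    # Binary search for the first junction whose start is >= lo.
    for start, end in index[bisect_left(index, (lo, 0)):]:
        if start > hi:
            break  # sorted by start, so nothing further can match
        if end <= hi:
            per_sample = JUNCTIONS[(chrom, start, end)]
            cov = sum(c for s, c in per_sample.items() if s in wanted)
            if cov > 0:
                hits[(start, end)] = cov
    return hits


print(query("chr1", 100, 450, "brain"))
```

The region constraint is resolved first through the coordinate index and the sample constraint through the inverted index, which mirrors the idea of constraining both junctions and samples in a single query; a production system would use persistent indexes rather than in-memory structures.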

AVAILABILITY AND IMPLEMENTATION

Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license.

CONTACT

chris.wilks@jhu.edu or langmea@cs.jhu.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
