Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.
BMC Bioinformatics. 2013 Jan 17;14:19. doi: 10.1186/1471-2105-14-19.
The Sequence Read Archive (SRA) is the largest public repository of sequencing data from next-generation sequencing platforms, including Illumina (Genome Analyzer, HiSeq, MiSeq, etc.), Roche 454 GS System, Applied Biosystems SOLiD System, Helicos Heliscope, PacBio RS, and others.
SRAdb is an attempt to make queries of the metadata associated with SRA submissions, studies, samples, experiments, and runs more robust and precise, and to make sequencing data in the SRA easier to access. We have parsed all SRA metadata into a SQLite database that is routinely updated and easily distributed. The SRAdb R/Bioconductor package uses this SQLite database to query and access the metadata. Full-text search makes metadata queries flexible and powerful. FASTQ files associated with query results can be downloaded for local analysis. The package also includes an interface from R to a popular genome browser, the Integrative Genomics Viewer (IGV).
The SRAdb Bioconductor package provides a convenient, integrated framework for querying and accessing SRA metadata quickly and powerfully from within R.
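The idea of running full-text queries against metadata held in a SQLite file can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module and SQLite's FTS5 extension on a toy in-memory table. The table name `run_meta`, its columns, the accessions, and the helper `search_runs` are all hypothetical and chosen for illustration only. They do not reflect the actual SRAdb schema or the package's R interface, and the sketch assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

# Toy in-memory database; the schema below is hypothetical, not the SRAdb schema.
con = sqlite3.connect(":memory:")

# An FTS5 virtual table indexes every column for full-text search
# (assumes the SQLite library was compiled with FTS5 support).
con.execute(
    "CREATE VIRTUAL TABLE run_meta USING fts5(run_accession, study_title, platform)"
)

# Made-up example records, for illustration only.
rows = [
    ("RUN0001", "Breast cancer transcriptome profiling", "Illumina HiSeq"),
    ("RUN0002", "Soil metagenome survey", "Roche 454 GS"),
    ("RUN0003", "Breast tumor exome sequencing", "Illumina MiSeq"),
]
con.executemany("INSERT INTO run_meta VALUES (?, ?, ?)", rows)


def search_runs(con, query):
    """Return run accessions whose metadata matches an FTS5 query string."""
    cur = con.execute(
        "SELECT run_accession FROM run_meta "
        "WHERE run_meta MATCH ? ORDER BY run_accession",
        (query,),
    )
    return [row[0] for row in cur]


# Terms are matched case-insensitively across all indexed columns.
hits = search_runs(con, "breast AND illumina")
print(hits)
```

Because the search terms may appear in any indexed column, a single query string can combine, for example, a disease keyword from a study title with a platform name, which is the kind of flexibility a full-text index offers over exact column matching.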