SRAdb: query and use public next-generation sequencing data from within R.

Author information

Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.

Publication information

BMC Bioinformatics. 2013 Jan 17;14:19. doi: 10.1186/1471-2105-14-19.

Abstract

BACKGROUND

The Sequence Read Archive (SRA) is the largest public repository of sequencing data from next-generation sequencing platforms, including Illumina (Genome Analyzer, HiSeq, MiSeq, etc.), Roche 454 GS System, Applied Biosystems SOLiD System, Helicos Heliscope, PacBio RS, and others.

RESULTS

SRAdb is an attempt to make queries of the metadata associated with SRA submissions, studies, samples, experiments, and runs more robust and precise, and to make access to sequencing data in the SRA easier. We have parsed all the SRA metadata into a SQLite database that is routinely updated and can be easily distributed. The SRAdb R/Bioconductor package then uses this SQLite database for querying and accessing metadata. Full-text search functionality makes querying metadata very flexible and powerful. FASTQ files associated with query results can be downloaded easily for local analysis. The package also includes an interface from R to a popular genome browser, the Integrative Genomics Viewer (IGV).
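
The idea of holding hierarchical metadata in a local SQLite file and querying it by keyword can be illustrated with a minimal sketch. This is not the SRAdb package or its actual schema: the table names, column names, accessions, and the helper functions `build_toy_db` and `runs_for_keyword` are all hypothetical. Python's built-in `sqlite3` module stands in for the R interface, and a plain `LIKE` substring match stands in for true full-text search.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a toy study/experiment/run hierarchy.

    All table names, columns, and accessions here are illustrative
    placeholders, not the real SRAdb schema or real SRA records.
    """
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE study (
            study_accession TEXT PRIMARY KEY,
            study_title     TEXT
        );
        CREATE TABLE experiment (
            experiment_accession TEXT PRIMARY KEY,
            study_accession      TEXT REFERENCES study(study_accession),
            platform             TEXT
        );
        CREATE TABLE run (
            run_accession        TEXT PRIMARY KEY,
            experiment_accession TEXT REFERENCES experiment(experiment_accession)
        );
        """
    )
    con.executemany(
        "INSERT INTO study VALUES (?, ?)",
        [("STUDY_A", "Breast cancer RNA-seq"),
         ("STUDY_B", "Soil metagenome survey")],
    )
    con.executemany(
        "INSERT INTO experiment VALUES (?, ?, ?)",
        [("EXP_A", "STUDY_A", "ILLUMINA"),
         ("EXP_B", "STUDY_B", "LS454")],
    )
    con.executemany(
        "INSERT INTO run VALUES (?, ?)",
        [("RUN_A1", "EXP_A"), ("RUN_A2", "EXP_A"), ("RUN_B1", "EXP_B")],
    )
    con.commit()
    return con


def runs_for_keyword(con, keyword):
    """Return sorted run accessions whose parent study title contains keyword.

    Uses a parameterized LIKE match (case-insensitive for ASCII in SQLite)
    as a simple stand-in for full-text search.
    """
    rows = con.execute(
        """
        SELECT r.run_accession
        FROM run AS r
        JOIN experiment AS e ON e.experiment_accession = r.experiment_accession
        JOIN study AS s ON s.study_accession = e.study_accession
        WHERE s.study_title LIKE ?
        ORDER BY r.run_accession
        """,
        (f"%{keyword}%",),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    con = build_toy_db()
    print(runs_for_keyword(con, "breast"))
```

Joining from study down to run is what lets a keyword found in study-level text resolve to the run-level accessions that would then be used to locate data files.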

CONCLUSIONS

The SRAdb Bioconductor package provides a convenient, integrated framework for querying and accessing SRA metadata quickly and powerfully from within R.

Similar articles

The Sequence Read Archive: explosive growth of sequencing data.
Nucleic Acids Res. 2012 Jan;40(Database issue):D54-6. doi: 10.1093/nar/gkr854. Epub 2011 Oct 18.

The sequence read archive.
Nucleic Acids Res. 2011 Jan;39(Database issue):D19-21. doi: 10.1093/nar/gkq1019. Epub 2010 Nov 9.

Cited by

Metadata retrieval from sequence databases with ffq.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac667.

Insights into the global freshwater virome.
Front Microbiol. 2022 Sep 28;13:953500. doi: 10.3389/fmicb.2022.953500. eCollection 2022.
