BMC Bioinformatics. 2014;15 Suppl 1(Suppl 1):S11. doi: 10.1186/1471-2105-15-S1-S11. Epub 2014 Jan 10.
The ISA-Tab format and software suite were developed to break the silo effect induced by technology-specific formats for a variety of data types and to better support experimental metadata tracking. Experimentalists seldom use a single technique to monitor biological signals. ISA is increasingly popular because it provides a multi-purpose, pragmatic and accessible format that abstracts the common constructs used to describe Investigations, Studies and Assays. To attract further interest in the format and to extend support for reproducible research and reusable data, we present the Risa package, a central component of ISA support that enables straightforward integration with R, the popular open source environment for data analysis.
The Risa package bridges the gap between ISA-compliant metadata collection and curation on one side and data analysis in the widely used statistical computing environment R on the other. The package offers functionality for: i) parsing ISA-Tab datasets into R objects; ii) augmenting annotation with extra metadata not explicitly stated in the ISA syntax; iii) interfacing with domain-specific R packages; iv) suggesting potentially useful R packages available in Bioconductor for subsequent processing of the experimental data described in the ISA format; and finally v) saving back to ISA-Tab files augmented with analysis-specific metadata from R. We demonstrate these features by presenting use cases for mass spectrometry data and DNA microarray data.
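To make the parse, augment and save-back cycle described above more concrete, the following is a minimal, language-neutral sketch in Python. It is not the Risa API and does not reproduce how Risa is implemented. It relies only on the fact that ISA-Tab study and assay files are tab-delimited tables linked through the "Sample Name" column. The file contents, the helper names `read_table` and `write_table`, and the derived file names are hypothetical and heavily trimmed for illustration.

```python
import csv
import io

# Hypothetical, heavily trimmed ISA-Tab content; real study (s_*.txt) and
# assay (a_*.txt) files carry many more columns.
STUDY_TSV = (
    '"Source Name"\t"Characteristics[organism]"\t"Sample Name"\n'
    '"source1"\t"Homo sapiens"\t"sample1"\n'
    '"source2"\t"Homo sapiens"\t"sample2"\n'
)
ASSAY_TSV = (
    '"Sample Name"\t"Raw Spectral Data File"\n'
    '"sample1"\t"run1.mzData"\n'
    '"sample2"\t"run2.mzData"\n'
)


def read_table(text):
    """Parse one tab-delimited ISA-Tab table into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def write_table(rows):
    """Serialize row dicts back to quoted, tab-delimited text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(rows[0]),  # keep the original column order
        delimiter="\t",
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# (i) Parse the study and assay tables into in-memory objects.
study = read_table(STUDY_TSV)
assay = read_table(ASSAY_TSV)

# Link sample-level metadata to assay rows through "Sample Name".
organism_by_sample = {
    row["Sample Name"]: row["Characteristics[organism]"] for row in study
}

# (v) Augment each assay row with analysis-specific metadata. The derived
# file name is a made-up stand-in for the output of an analysis step.
annotated = [
    dict(
        row,
        **{
            "Derived Spectral Data File": row["Raw Spectral Data File"].replace(
                ".mzData", ".peaks.txt"
            )
        },
    )
    for row in assay
]

print(write_table(annotated))
```

In this sketch the unmodified assay table round-trips to identical text, which mirrors the goal of saving back to ISA-Tab without losing the original annotation.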
The Risa package is open source (under the LGPL license) and freely available through Bioconductor. By making Risa available, we aim to facilitate the processing of experimental data and to encourage a uniform representation of experimental information and results, while delivering tools for traceability and provenance tracking.
The Risa package has been available since Bioconductor 2.11 (version 1.0.0), and version 1.2.1 appeared in Bioconductor 2.12, both with documentation and examples. The latest version of the code is in the Bioconductor development branch and can also be accessed on GitHub at https://github.com/ISA-tools/Risa, where the issue tracker allows users to report bugs or request features.