Ramos Marcel, Schiffer Lucas, Re Angela, Azhar Rimsha, Basunia Azfar, Rodriguez Carmen, Chan Tiffany, Chapman Phil, Davis Sean R, Gomez-Cabrero David, Culhane Aedin C, Haibe-Kains Benjamin, Hansen Kasper D, Kodali Hanish, Louis Marie S, Mer Arvind S, Riester Markus, Morgan Martin, Carey Vince, Waldron Levi
Graduate School of Public Health & Health Policy, City University of New York, New York, New York.
Institute for Implementation Science in Population Health, City University of New York, New York, New York.
Cancer Res. 2017 Nov 1;77(21):e39-e42. doi: 10.1158/0008-5472.CAN-17-0344.
Multiomics experiments are increasingly commonplace in biomedical research and add layers of complexity to experimental design, data integration, and analysis. R and Bioconductor provide a generic framework for statistical analysis and visualization, as well as specialized data classes for a variety of high-throughput data types, but methods are lacking for integrative analysis of multiomics experiments. The MultiAssayExperiment software package, implemented in R and leveraging Bioconductor software and design principles, provides for the coordinated representation of, storage of, and operation on multiple diverse genomics data. We provide the unrestricted multiple 'omics data for each cancer tissue in The Cancer Genome Atlas as ready-to-analyze MultiAssayExperiment objects and demonstrate in these and other datasets how the software simplifies data representation, statistical analysis, and visualization. The MultiAssayExperiment Bioconductor package reduces major obstacles to efficient, scalable, and reproducible statistical analysis of multiomics data and enhances data science applications of multiple omics datasets.