Siebert Janet C, Munsil Wes, Rosenberg-Hasson Yael, Davis Mark M, Maecker Holden T
CytoAnalytics, Denver, CO, USA.
J Transl Med. 2012 Mar 28;10:62. doi: 10.1186/1479-5876-10-62.
Systems-level approaches are increasingly common in both murine and human translational studies. These approaches employ multiple high-information-content assays. As a result, there is a need for tools that integrate heterogeneous laboratory and clinical/demographic data and that allow exploration of those data by aggregating and/or segregating results on particular variables (e.g., mean cytokine levels by age and gender).
Here we describe the application of standard data warehousing tools to create a novel environment for user-driven upload, integration, and exploration of heterogeneous data. The system presented here currently supports flow cytometry and immunoassays performed in the Stanford Human Immune Monitoring Center, but could be applied more generally.
Users upload assay results contained in platform-specific spreadsheets of a defined format, and clinical and demographic data in spreadsheets of flexible format. Users then map sample IDs to connect the assay results with the metadata. An OLAP (on-line analytical processing) data exploration interface allows filtering and display of various dimensions (e.g., Luminex analytes in rows, treatment group in columns, filtered on a particular study). Statistics such as mean, median, and N can be displayed. The views can be expanded or contracted to aggregate or segregate data at various levels. Individual-level data is accessible with a single click. The result is a user-driven system that permits data integration and exploration in a variety of settings. We show how the system can be used to find gender-specific differences in serum cytokine levels, and compare them across experiments and assay types.
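The workflow described above (map sample IDs, join assay results to metadata, then pivot along chosen dimensions) can be illustrated with a minimal sketch in Python using pandas. This is not the system's implementation, which relies on data warehousing and OLAP tools; every table, column name, and value below is hypothetical and exists only to show the idea of analytes in rows, treatment group in columns, and mean, median, and N as the displayed statistics.

```python
import pandas as pd

# Hypothetical assay results in long format: one row per sample and analyte.
assay = pd.DataFrame({
    "assay_sample_id": ["A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4"],
    "analyte": ["IL6", "TNFa"] * 4,
    "value": [10.0, 5.0, 14.0, 7.0, 20.0, 9.0, 30.0, 11.0],
})

# Hypothetical clinical/demographic metadata keyed by subject ID.
metadata = pd.DataFrame({
    "subject_id": ["S1", "S2", "S3", "S4"],
    "gender": ["F", "M", "F", "M"],
    "treatment_group": ["placebo", "placebo", "vaccine", "vaccine"],
})

# Hypothetical user-supplied mapping from assay sample IDs to subject IDs.
id_map = pd.DataFrame({
    "assay_sample_id": ["A1", "A2", "A3", "A4"],
    "subject_id": ["S1", "S2", "S3", "S4"],
})

# Connect assay results with metadata through the ID mapping.
# validate= raises an error if an ID maps to more than one subject.
merged = (
    assay
    .merge(id_map, on="assay_sample_id", validate="many_to_one")
    .merge(metadata, on="subject_id", validate="many_to_one")
)


def summarize(df, rows, cols):
    """Pivot values into rows x cols and report mean, median, and N per cell."""
    return df.pivot_table(
        index=rows,
        columns=cols,
        values="value",
        aggfunc=["mean", "median", "count"],
    )


# Aggregated view: analytes in rows, treatment group in columns.
by_group = summarize(merged, "analyte", "treatment_group")

# Expanded view: segregate each treatment group further by gender.
by_group_gender = summarize(merged, "analyte", ["treatment_group", "gender"])

# Individual-level rows behind one cell of the aggregated view.
il6_vaccine = merged[
    (merged["analyte"] == "IL6") & (merged["treatment_group"] == "vaccine")
]

print(by_group)
print(by_group_gender)
print(il6_vaccine)
```

Passing a longer list of column dimensions to `summarize` plays the role of expanding a view, and filtering `merged` directly stands in for the drill-through to individual-level data.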
We have used the tools and techniques of data warehousing, including open-source business intelligence software, to support investigator-driven data integration and mining of diverse immunological data.