Center for Translational Data Science, University of Chicago, Chicago, Illinois, USA.
Amazon Web Services, Seattle, Washington, USA.
J Am Med Inform Assoc. 2022 Mar 15;29(4):619-625. doi: 10.1093/jamia/ocab247.
The objective was to develop and operate a cloud-based federated system for managing, analyzing, and sharing patient data for research, while allowing each resource that shares patient data to operate its component under its own governance rules. The federated system is called the Biomedical Research Hub (BRH).
The BRH is a cloud-based federated system built over a core set of software services called framework services. BRH framework services include services for authentication and authorization, services for generating and assessing findable, accessible, interoperable, and reusable (FAIR) data, and services for importing and exporting bulk clinical data. The BRH comprises two kinds of components: data resources, which are operated by different entities, and workspaces, which can access and analyze data from one or more of the data resources in the BRH.
The BRH contains multiple data commons that in aggregate provide access to over 6 PB of research data from over 400 000 research participants.
With the growing acceptance of public cloud computing platforms for biomedical research, and the growing use of opaque persistent digital identifiers for datasets, data objects, and other entities, there is now a foundation for systems that federate data from multiple independently operated data resources. Each such resource exposes FAIR application programming interfaces and uses its own data model, and applications can be built that access data from one or more of the data resources.
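The federation pattern described here can be illustrated with a minimal in-memory sketch. It is not the BRH implementation: the class names `DataResource` and `Hub`, the user allow-list standing in for governance rules, and the sample records are all hypothetical, chosen only to show opaque persistent identifiers resolving to independently governed resources that each keep their own data model.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class DataResource:
    """A hypothetical, independently operated data resource."""
    name: str
    # Assumption: an allow-list stands in for the resource's own governance rules.
    authorized_users: set[str]
    # Records are free-form dicts, so each resource can use its own data model.
    records: dict[str, dict] = field(default_factory=dict)

    def authorize(self, user: str) -> bool:
        return user in self.authorized_users


class Hub:
    """A hypothetical hub that maps opaque identifiers to data resources."""

    def __init__(self) -> None:
        self._index: dict[str, DataResource] = {}

    def register(self, resource: DataResource, record: dict) -> str:
        # The identifier is opaque: it carries no information about
        # the resource that holds the record or about the record itself.
        guid = str(uuid.uuid4())
        resource.records[guid] = record
        self._index[guid] = resource
        return guid

    def fetch(self, user: str, guid: str) -> dict:
        # Resolve the identifier, then defer the access decision
        # to the resource that owns the record.
        resource = self._index[guid]
        if not resource.authorize(user):
            raise PermissionError(f"{user} is not authorized by {resource.name}")
        return resource.records[guid]
```

A workspace-style client would hold only identifiers and call `hub.fetch(user, guid)`; whether the call succeeds depends on the rules of the resource that owns the record, not on the hub.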