Christoph J, Griebel L, Leb I, Engel I, Köpcke F, Toddenroth D, Prokosch H-U, Laufer J, Marquardt K, Sedlmayr M
Dr. Martin Sedlmayr, Lehrstuhl für Medizinische Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Wetterkreuz 13, 91058 Erlangen, Germany, E-mail:
Methods Inf Med. 2015;54(3):276-82. doi: 10.3414/ME13-01-0133. Epub 2014 Nov 7.
The secondary use of clinical data offers substantial opportunities for clinical and translational research as well as for quality assurance projects. Such use requires a flexible and scalable infrastructure that complies with privacy requirements. The major goals of the cloud4health project are to define such an architecture, to implement a technical prototype that fulfills these requirements, and to evaluate it in three use cases.
The architecture provides components that allow multiple data provider sites, such as hospitals, to extract free text as well as structured data from local sources and to de-identify these data for further anonymous or pseudonymous processing. Free-text documentation is analyzed and transformed into structured information by text-mining services provided within a cloud-computing environment. The newly derived annotations can then be integrated with the already available structured data items, and the resulting data sets can be uploaded to a central study portal for further analysis.
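The data flow described above can be illustrated with a minimal Python sketch. It is not the cloud4health implementation: the record layout, the function names (`pseudonymize`, `deidentify_text`, `mine_text`, `build_study_row`), the keyed-hash pseudonym, and the keyword lookup standing in for the cloud text-mining service are all assumptions made for illustration only.

```python
import hashlib
import hmac
import re
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical local record at a data provider site."""
    patient_id: str
    structured: dict   # already structured items, e.g. {"age": 64}
    free_text: str     # clinical narrative, e.g. a discharge letter


def pseudonymize(patient_id: str, site_key: bytes) -> str:
    """Derive a stable pseudonym with a keyed hash (assumed scheme)."""
    digest = hmac.new(site_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify_text(text: str, identifiers: list[str]) -> str:
    """Mask known identifying strings before the text leaves the site."""
    for ident in identifiers:
        text = re.sub(re.escape(ident), "[REDACTED]", text, flags=re.IGNORECASE)
    return text


def mine_text(text: str) -> list[str]:
    """Stand-in for the cloud text-mining service: a simple keyword lookup."""
    vocabulary = {"diabetes": "DIABETES", "hypertension": "HYPERTENSION"}
    lowered = text.lower()
    return sorted(code for term, code in vocabulary.items() if term in lowered)


def build_study_row(record: PatientRecord, site_key: bytes,
                    identifiers: list[str]) -> dict:
    """De-identify, annotate, and merge with the structured items."""
    clean_text = deidentify_text(record.free_text, identifiers)
    return {
        "pseudonym": pseudonymize(record.patient_id, site_key),
        **record.structured,
        "annotations": mine_text(clean_text),
    }


# Example: one record prepared for upload to a (hypothetical) study portal.
record = PatientRecord("P001", {"age": 64},
                       "Mr. Miller has diabetes and hypertension.")
row = build_study_row(record, site_key=b"local-secret", identifiers=["Miller"])
```

In this sketch only the pseudonym, the structured items, and the derived annotations end up in the row; the patient identifier and the narrative text stay at the provider site, which mirrors the separation the architecture aims for.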
Based on the architecture design, a prototype has been implemented and is under evaluation in three clinical use cases. Data from several hundred patients, provided by a university hospital and a private hospital chain, have already been processed.
Cloud4health has shown how existing components for the secondary use of structured data can be complemented with text mining in a privacy-compliant manner. The cloud-computing paradigm allows flexible and dynamically adaptable service provision, which makes it easier for data providers to adopt the services without having to invest in their own hardware resources and software tools.