Bowles Juliana K F, Mendoza-Santana Juan, Vermeulen Andreas F, Webber Thais, Blackledge Euan
School of Computer Science, University of St Andrews, UK.
Sopra Steria, Edinburgh, UK.
Stud Health Technol Inform. 2020 Nov 23;275:17-21. doi: 10.3233/SHTI200686.
The potential of healthcare systems worldwide is expanding as healthcare providers regularly gain access to new medical devices and data sources that could be used to further personalise, improve and revise treatments. However, there is presently a large gap between the data collected, the systems that store the data, and the ability to perform big data analytics on combinations of such data. This paper proposes a novel approach to integrating data from multiple sources and formats by giving the data a uniform structure in a healthcare data lake with multiple zones that reflect how refined the data is: from raw to curated, at which point it is ready to be consumed or used for analysis. The integration further requires solutions that can be proven to be secure, such as patient-centric data sharing agreements (smart contracts) on a blockchain, and novel privacy-preserving methods for extracting metadata from data sources, originally derived from partially-structured or from completely unstructured data. The work presented here is being developed as part of an EU project whose ultimate aim is to develop solutions for integrating healthcare data for enhanced citizen-centred care and analytics across Europe.
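As an illustration of the zoned data lake idea in the abstract, the following is a minimal sketch of how a record might move from the raw zone towards the curated zone. It is not the authors' implementation: the abstract names only the raw and curated ends of the range, so the intermediate `STRUCTURED` zone, the `Record` type, and the `promote` and `ready_for_analysis` helpers are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, replace
from enum import IntEnum


class Zone(IntEnum):
    """Refinement levels of the data lake, ordered from least to most refined."""
    RAW = 0
    STRUCTURED = 1  # hypothetical intermediate zone, not named in the abstract
    CURATED = 2


@dataclass(frozen=True)
class Record:
    """A piece of data in the lake, tagged with its origin and current zone."""
    source: str
    payload: dict
    zone: Zone = Zone.RAW


def promote(record: Record) -> Record:
    """Return a copy of the record moved one zone closer to curated."""
    if record.zone is Zone.CURATED:
        raise ValueError("record is already in the curated zone")
    return replace(record, zone=Zone(record.zone + 1))


def ready_for_analysis(record: Record) -> bool:
    """Only curated records are considered ready to be consumed or analysed."""
    return record.zone is Zone.CURATED
```

In this sketch a record always enters the lake in the raw zone and is promoted one step at a time, so analytics code can rely on `ready_for_analysis` rather than inspecting the data itself. Any real promotion step would also apply the validation, structuring, and privacy checks appropriate to each zone, which are omitted here.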