Department of Radiation Oncology, Sun Yat-sen University Cancer Center; State Key Laboratory of Oncology in South China; Collaborative Innovation Center for Cancer Medicine; Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, China.
YiduCloud Technology Ltd, Beijing, China.
Br J Radiol. 2019 Oct;92(1102):20190255. doi: 10.1259/bjr.20190255. Epub 2019 Aug 20.
To develop a big data intelligence platform for secondary use of electronic health records (EHRs) data to facilitate research for nasopharyngeal cancer (NPC).
This project was launched in 2015 as a collaboration between an academic cancer centre and a technology company. Patients diagnosed with NPC at Sun Yat-sen University Cancer Center since January 2008 were included in the platform. Standard data elements were established to define 981 variables for the platform. For each patient, data from 13 EHR systems were extracted, integrated, structured and normalized. Eight functional modules were constructed to help investigators identify eligible patients, establish research projects, conduct statistical analyses, track follow-up, search the literature, etc.
From January 2008 to December 2018, 54,703 patients diagnosed with NPC were included. Of these patients, 39,058 (71.4%) were male and 15,645 (28.6%) were female; the median age was 47 (interquartile range, 39-55) years. Of the 981 variables, 341 were obtained through data structuring and normalization, of which 68 were generated by combining multiple data sources via well-defined logical rules. The average precision, recall and F1-measure for the 341 variables were 0.97 ± 0.024, 0.92 ± 0.030 and 0.94 ± 0.027, respectively. The platform is updated every seven days to include new patients and add new data for existing patients. To date, eight big data-driven retrospective studies have been published from the platform.
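For readers unfamiliar with the reported metrics, the following is a minimal sketch of how per-variable precision, recall and F1-measure can be computed and then summarized as mean ± standard deviation. It assumes each variable was checked against a manually annotated reference, giving counts of true positives, false positives and false negatives; the abstract does not describe the validation procedure, so the function names, variable names and counts below are hypothetical.

```python
from statistics import mean, stdev


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def summarize(counts: dict) -> dict:
    """Map each metric name to (mean, sample SD) across all variables.

    `counts` maps a variable name to its (tp, fp, fn) tuple.
    """
    scores = [precision_recall_f1(*c) for c in counts.values()]
    summary = {}
    for name, values in zip(("precision", "recall", "f1"), zip(*scores)):
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[name] = (mean(values), sd)
    return summary


# Hypothetical validation counts for two structured variables (illustration only).
example_counts = {
    "t_stage": (97, 3, 8),
    "ebv_dna": (95, 5, 5),
}
summary = summarize(example_counts)
for metric, (avg, sd) in summary.items():
    print(f"{metric}: {avg:.2f} ± {sd:.3f}")
```

Note that the mean of per-variable F1 scores is not, in general, equal to the F1 computed from the mean precision and mean recall, although the two are often close.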
Our big data intelligence platform demonstrates the feasibility of integrating EHRs data from routine healthcare and offers an important perspective on real-world study of NPC. Future efforts may focus on data sharing among multiple hospitals and the public release of data files.
Our big data intelligence platform is the first disease-specific data platform for NPC research. It incorporates comprehensive EHRs data from routine healthcare, which can facilitate real-world study of NPC in risk stratification, decision-making and comorbidity management.