Hospital Universitario 12 de Octubre, Av. de Córdoba, s/n, 28041 Madrid, Spain; ETSI Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain.
Hospital Universitario 12 de Octubre, Av. de Córdoba, s/n, 28041 Madrid, Spain.
J Biomed Inform. 2021 Mar;115:103697. doi: 10.1016/j.jbi.2021.103697. Epub 2021 Feb 3.
COVID-19 ranks as the single largest health incident worldwide in decades. In such a scenario, electronic health records (EHRs) should respond promptly both to healthcare needs and to data uses beyond direct medical care, known as secondary uses, which include biomedical research. However, each data analysis initiative usually defines its own information model in line with its requirements. These specifications share clinical concepts but differ in format and recording criteria, which leads to redundant data entry across multiple electronic data capture systems (EDCs) and a consequent investment of effort and time by the organization.
This study sought to design and implement a flexible methodology based on detailed clinical models (DCM), which would enable EHRs generated in a tertiary hospital to be effectively reused without loss of meaning and within a short time.
The proposed methodology comprises four stages: (1) specification of an initial set of relevant variables for COVID-19; (2) modeling and formalization of clinical concepts using the ISO 13606 standard and the SNOMED CT and LOINC terminologies; (3) definition of transformation rules that generate secondary use models from standardized EHRs, and their implementation in the R language; and (4) implementation and validation of the methodology by generating the International Severe Acute Respiratory and emerging Infection Consortium (ISARIC-WHO) COVID-19 case report form. The process was implemented in a 1300-bed tertiary hospital for a cohort of 4489 patients hospitalized from 25 February 2020 to 10 September 2020.
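Stage (3) can be illustrated with a minimal sketch of a rule-driven transformation. This is not the authors' implementation (which the abstract states was written in R); it is a Python illustration under stated assumptions. The `Observation` and `Rule` structures, the target field names, and the "first recorded value" selection policy are all hypothetical. The LOINC codes 8310-5 (body temperature) and 8867-4 (heart rate) and the UCUM units `Cel` and `[degF]` are used only as example inputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Observation:
    """One standardized EHR observation (hypothetical, simplified structure)."""
    patient_id: str
    code: str        # terminology code of the standardized concept
    value: float
    unit: str        # UCUM unit string
    timestamp: str   # ISO 8601, so lexical order equals chronological order


@dataclass(frozen=True)
class Rule:
    """Maps one standardized concept to one field of a secondary use model."""
    source_code: str
    target_field: str                      # hypothetical field name
    select: Callable[[List[Observation]], Optional[Observation]]
    convert: Callable[[Observation], float]


def first_recorded(obs: List[Observation]) -> Optional[Observation]:
    """Assumed selection policy: keep the earliest observation, if any."""
    return min(obs, key=lambda o: o.timestamp) if obs else None


def to_celsius(o: Observation) -> float:
    """Normalize a temperature to degrees Celsius."""
    if o.unit == "Cel":
        return o.value
    if o.unit == "[degF]":
        return round((o.value - 32) * 5 / 9, 1)
    raise ValueError(f"unsupported temperature unit: {o.unit}")


# Illustrative rule set; the field names are assumptions, not the ISARIC spec.
RULES = [
    Rule("8310-5", "admission_temperature_c", first_recorded, to_celsius),
    Rule("8867-4", "admission_heart_rate_bpm", first_recorded, lambda o: o.value),
]


def build_rows(observations: List[Observation],
               rules: List[Rule]) -> Dict[str, Dict[str, Optional[float]]]:
    """Apply every rule to every patient; missing concepts become None."""
    by_patient: Dict[str, List[Observation]] = {}
    for o in observations:
        by_patient.setdefault(o.patient_id, []).append(o)

    rows: Dict[str, Dict[str, Optional[float]]] = {}
    for pid, obs in by_patient.items():
        row: Dict[str, Optional[float]] = {}
        for rule in rules:
            matches = [o for o in obs if o.code == rule.source_code]
            chosen = rule.select(matches)
            row[rule.target_field] = rule.convert(chosen) if chosen else None
        rows[pid] = row
    return rows


SAMPLE = [
    Observation("p1", "8310-5", 100.4, "[degF]", "2020-03-01T08:00:00"),
    Observation("p1", "8310-5", 37.0, "Cel", "2020-03-02T08:00:00"),
    Observation("p1", "8867-4", 90.0, "/min", "2020-03-01T08:05:00"),
    Observation("p2", "8867-4", 72.0, "/min", "2020-03-03T10:00:00"),
]

if __name__ == "__main__":
    for pid, row in build_rows(SAMPLE, RULES).items():
        print(pid, row)
```

Keeping the rules as data rather than hard-coded logic is one way to obtain the adaptability the abstract describes: a change in a data specification would then mean editing the rule list, not the transformation routine.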
An initial and expandable set of relevant concepts for COVID-19 was identified, modeled and formalized using the ISO 13606 standard and the SNOMED CT and LOINC terminologies. Similarly, an algorithm was designed and implemented in R and then applied to process EHRs in accordance with the standardized concepts, transforming them into secondary use models. Lastly, these resources were applied to obtain a data extract conforming to the ISARIC-WHO COVID-19 case report form, without requiring manual data collection. The methodology yielded the observation domain of this model with a coverage of over 85% of patients for the majority of concepts.
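The coverage figure reported above can be read as the percentage of patients for whom a given concept has a value in the extract. The following Python sketch computes that quantity under this assumed definition; the abstract does not give the exact formula, and the field names and sample rows are hypothetical.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def coverage_by_field(rows: Dict[str, Row]) -> Dict[str, float]:
    """Percentage of patients with a non-missing value, per field."""
    n_patients = len(rows)
    if n_patients == 0:
        return {}
    fields = sorted({field for row in rows.values() for field in row})
    return {
        field: 100.0
        * sum(1 for row in rows.values() if row.get(field) is not None)
        / n_patients
        for field in fields
    }


def fields_above(coverage: Dict[str, float], threshold: float) -> List[str]:
    """Fields whose coverage strictly exceeds the threshold (in percent)."""
    return [field for field, pct in coverage.items() if pct > threshold]


# Hypothetical extract: four patients, two fields, one missing value.
EXTRACT: Dict[str, Row] = {
    "p1": {"temperature_c": 38.0, "heart_rate_bpm": 90.0},
    "p2": {"temperature_c": None, "heart_rate_bpm": 72.0},
    "p3": {"temperature_c": 36.8, "heart_rate_bpm": 65.0},
    "p4": {"temperature_c": 37.2, "heart_rate_bpm": 80.0},
}

if __name__ == "__main__":
    cov = coverage_by_field(EXTRACT)
    print(cov)
    print(fields_above(cov, 85.0))
```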
This study provides a solution to the difficulty of rapidly and efficiently obtaining EHR-derived data for secondary use in COVID-19, one that can adapt to changes in data specifications and is applicable to other organizations and other health conditions. The conclusion drawn from this initial validation is that the DCM-based methodology allows effective reuse of the EHRs generated in a tertiary hospital during the COVID-19 pandemic, with no additional effort or time for the organization and with a greater data scope than that yielded by conventional manual data collection in ad hoc EDCs.