Perspect Health Inf Manag. 2021 Jul 1;18(3):1d. eCollection 2021 Summer.
The availability of accurate, reliable, and timely clinical data is crucial for clinicians, researchers, and policymakers to respond effectively to emerging public health threats. This was typified by the recent SARS-CoV-2 pandemic and the critical knowledge and data gaps associated with coronavirus disease 2019 (COVID-19). We sought to create an adaptive, living data mart containing detailed clinical, epidemiologic, and outcome data from COVID-19 patients in our healthcare system. If successful, the approach could then be used for any future outbreak or disease.
From March 13, 2020 onward, demographics, comorbidities, and outpatient medications, along with 75 laboratory, 2 imaging, 19 therapeutic, and 4 outcome-related parameters, were manually extracted from the electronic medical record (EMR) of SARS-CoV-2-positive patients. These parameters were entered into a registry featuring calculation and graphing tools, pivot tables, and a macro programming language. Initially, two internal medicine residents populated the registry; professional data abstractors later took over. Clinical parameters were developed with input from infectious diseases and critical care physicians, using a modified COVID-19 worksheet from the U.S. Centers for Disease Control and Prevention (CDC). Registry contents were then migrated to a browser-based, metadata-driven electronic data capture software platform. Eventually, we developed queries and used various business intelligence (BI) tools, which enabled us to semi-automate ingestion of 147 clinical and outcome parameters from the EMR via a large U.S. hospital-based, service-level, all-payer database. Statistical analyses were performed in R and Minitab.
From March 13, 2020 to May 17, 2021, 549,691 SARS-CoV-2 test results from 236,144 distinct patients, along with location, admission status, and other epidemiologic details, were stored on the cloud-based BI platform. From March 2020 until May 2021, clinical-epidemiologic parameters had to be extracted manually. Of those patients, 543 had ≥75 parameters fully entered in the registry. Ten clinical characteristics were significantly associated with the need for hospital admission. Only one characteristic was associated with a need for ICU admission. Use of supplemental oxygen, vasopressors, and outpatient statins was associated with increased mortality. Initially, 0.5 to 1.5 hours per patient chart (approximately 450-575 person-hours) were required to manually extract the parameters and populate the registry. As of May 17, 2021, semi-automated data ingestion from the U.S. hospital all-payer database, employing user-defined queries, was implemented. That process can ingest and populate the registry with 147 clinical, epidemiologic, and outcome parameters at a rate of 2 hours per 100 patient charts.
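The throughput figures above can be compared with a few lines of arithmetic. The Python sketch below is illustrative only: the constant and function names are hypothetical, and applying both rates to the 543 fully entered patients is an assumption made here for comparison, not a calculation reported by the authors.

```python
# Figures taken from the abstract.
MANUAL_HOURS_PER_CHART = (0.5, 1.5)    # reported manual range per patient chart
SEMI_AUTO_HOURS_PER_100_CHARTS = 2.0   # reported semi-automated rate
FULLY_ENTERED_PATIENTS = 543           # patients with >=75 parameters entered


def semi_auto_hours_per_chart() -> float:
    """Convert the semi-automated rate to hours per single chart."""
    return SEMI_AUTO_HOURS_PER_100_CHARTS / 100


def speedup_range() -> tuple[float, float]:
    """Ratio of manual to semi-automated time per chart (low, high)."""
    per_chart = semi_auto_hours_per_chart()
    low, high = MANUAL_HOURS_PER_CHART
    return low / per_chart, high / per_chart


def total_hours(n_charts: int, hours_per_chart: float) -> float:
    """Total person-hours for n_charts at a fixed per-chart rate."""
    return n_charts * hours_per_chart


if __name__ == "__main__":
    per_chart = semi_auto_hours_per_chart()
    print(f"Semi-automated: {per_chart * 60:.1f} minutes per chart")
    low, high = speedup_range()
    print(f"Speedup over manual entry: {low:.0f}x to {high:.0f}x")
    # Assumption: both rates applied to the same 543 charts.
    manual_low = total_hours(FULLY_ENTERED_PATIENTS, MANUAL_HOURS_PER_CHART[0])
    manual_high = total_hours(FULLY_ENTERED_PATIENTS, MANUAL_HOURS_PER_CHART[1])
    semi_auto = total_hours(FULLY_ENTERED_PATIENTS, per_chart)
    print(f"Manual, 543 charts: {manual_low:.1f} to {manual_high:.1f} hours")
    print(f"Semi-automated, 543 charts: {semi_auto:.1f} hours")
```

Under these assumptions, the semi-automated process takes about 1.2 minutes per chart, roughly 25 to 75 times faster than manual extraction. The per-chart bounds applied to 543 charts also bracket the reported estimate of approximately 450-575 person-hours.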
A living COVID-19 registry represents a mechanism to facilitate optimal sharing of data between providers, consumers, health information networks, and health plans through technology-enabled, secure-access electronic health information. Our approach also draws on a diversity of roles new to the field, including residents, staff, and the quality department, in addition to professional data extractors and the health informatics team. Initially, because of the overwhelming and still accelerating number of infections and the labor- and time-intensive nature of the project, only a small fraction of all patients with COVID-19 had all parameters entered in the registry. Therefore, this report also offers lessons learned and discusses sustainability issues, should others wish to establish a registry. It also highlights the registry's local and broader public health significance. Beginning in June 2021, whole-genome sequencing results, such as lineages harboring important viral mutations or variants of concern, will be linked to the clinical metadata.