Department of Emergency Medicine, University of Nebraska Medical Center, Omaha, Nebraska, USA.
Department of Pathology and Microbiology, University of Nebraska Medical Center, Omaha, Nebraska, USA.
J Am Med Inform Assoc. 2019 Apr 1;26(4):286-293. doi: 10.1093/jamia/ocy172.
Clinical research data warehouses are largely populated with information extracted from electronic health records (EHRs). While these data provide information about a patient's medications, laboratory results, diagnoses, and history, the patient's social, economic, and environmental determinants of health are also major contributing factors in readmission, morbidity, and mortality, and they are often absent or unstructured in the EHR. Details about a patient's socioeconomic status may be found in the U.S. census. To facilitate research on the impact of socioeconomic status on health outcomes, clinical and socioeconomic data must be linked in a repository in a way that supports seamless interrogation of these diverse data elements. This study demonstrates a method for linking clinical and location-based data and querying these data in a de-identified data warehouse using Informatics for Integrating Biology and the Bedside (i2b2).
Patient data were extracted from the EHR at Nebraska Medicine. Socioeconomic variables were drawn from the 2011-2015 five-year block group estimates of the American Community Survey. Data querying was performed using Informatics for Integrating Biology and the Bedside. All location-based data were truncated to prevent identification of any location with a population of fewer than 20 000 individuals.
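The linking and truncation steps described above can be illustrated with a minimal sketch. The abstract does not specify how locations were truncated, so everything here is an assumption made for illustration: the sketch shortens a 12-digit census block group GEOID to coarser geographic levels (tract, county, state) until the area's population reaches the 20 000 threshold. The names `truncate_geoid`, `link_patient`, `patient_num`, and `median_income`, and all population and income values, are hypothetical and are not taken from the study.

```python
from typing import Dict, Optional

# Census GEOID prefix lengths: block group (12), tract (11), county (5), state (2).
GEOID_LEVELS = (12, 11, 5, 2)
MIN_POPULATION = 20_000  # threshold stated in the abstract


def truncate_geoid(geoid: str, population: Dict[str, int]) -> Optional[str]:
    """Return the longest GEOID prefix whose population meets the threshold.

    Assumed scheme (not confirmed by the source): try the finest level first
    and fall back to coarser levels; return None if no level qualifies.
    """
    for length in GEOID_LEVELS:
        prefix = geoid[:length]
        if population.get(prefix, 0) >= MIN_POPULATION:
            return prefix
    return None


def link_patient(patient: dict, acs_by_block_group: dict,
                 population: Dict[str, int]) -> dict:
    """Attach block-group variables to a patient and keep only a truncated location."""
    geoid = patient["block_group_geoid"]
    linked = {
        "patient_num": patient["patient_num"],
        "location": truncate_geoid(geoid, population),
    }
    # Socioeconomic variables are looked up at the block group level.
    linked.update(acs_by_block_group.get(geoid, {}))
    return linked


# Made-up illustrative values, not real census counts or survey estimates.
population = {
    "310550001001": 1_200,   # block group
    "31055000100": 4_100,    # tract
    "31055": 550_000,        # county
    "31": 1_900_000,         # state
}
acs_by_block_group = {"310550001001": {"median_income": 52_000}}
patient = {"patient_num": 1, "block_group_geoid": "310550001001"}

record = link_patient(patient, acs_by_block_group, population)
```

In this toy data the block group and tract both fall below the threshold, so the stored location falls back to the five-digit county prefix while the block group variable stays attached to the patient record. Whether block-group-level values remain safe to release alongside a truncated location is a separate disclosure question that this sketch does not address.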
We successfully linked location-based and clinical data in a de-identified data warehouse and demonstrated its utility with a sample use case.
With location-based data available for querying, research investigating the impact of socioeconomic context on health outcomes is possible. Efforts to improve geocoding can readily be incorporated into this model.
This study demonstrates a means for incorporating and querying census data in a de-identified clinical data warehouse.