Rasmussen-Torvik Laura J, Furmanchuk Al'ona, Stoddard Alexander J, Osinski Kristen I, Meurer John R, Smith Nicholas, Chrischilles Elizabeth, Black Bernard S, Kho Abel
Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611.
Center for Health Information Partnerships, Northwestern University Feinberg School of Medicine, Chicago, IL 60611.
Int J Popul Data Sci. 2020;5(1). doi: 10.23889/ijpds.v5i1.1156. Epub 2020 Apr 2.
Few studies have addressed how to select a study sample when using electronic health record (EHR) data.
To examine how changing the criterion for the number of visits in EHR data required for inclusion in a study sample affects one basic epidemiologic measure: estimates of disease period prevalence.
Year 2016 EHR data from three Midwestern health systems (Northwestern Medicine in Illinois, University of Iowa Health Care, and Froedtert & the Medical College of Wisconsin, all regional tertiary health care systems including hospitals and clinics) were used to examine how alternate definitions of the study sample, based on number of healthcare visits in one year, affected measures of disease period prevalence. In 2016, each of these health systems saw between 160,000 and 420,000 unique patients. Curated collections of ICD-9, ICD-10, and SNOMED codes (from CMS-approved electronic clinical quality measures) were used to define three diseases: acute myocardial infarction, asthma, and diabetic nephropathy.
Across all health systems, increasing the minimum required number of visits to be included in the study sample monotonically increased crude period prevalence estimates. The rate at which prevalence estimates increased with number of visits varied across sites and across diseases.
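The sensitivity analysis described above can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function name `prevalence_by_min_visits`, the patient identifiers, and the toy visit counts are all hypothetical, and a case is assumed to be any patient in the sample with at least one qualifying diagnosis code during the year.

```python
from typing import Dict, Iterable, Set


def prevalence_by_min_visits(
    visits_per_patient: Dict[str, int],
    case_ids: Set[str],
    thresholds: Iterable[int],
) -> Dict[int, float]:
    """Crude period prevalence for each minimum-visit inclusion threshold.

    visits_per_patient: visits in the study year, keyed by patient id (hypothetical input).
    case_ids: patients with at least one qualifying diagnosis code in that year.
    """
    results: Dict[int, float] = {}
    for k in thresholds:
        # Study sample: patients with at least k visits during the year.
        sample = [pid for pid, n in visits_per_patient.items() if n >= k]
        if not sample:
            results[k] = float("nan")  # no denominator at this threshold
            continue
        n_cases = sum(1 for pid in sample if pid in case_ids)
        results[k] = n_cases / len(sample)
    return results


# Toy data (invented for illustration only, not taken from the study).
visits = {"p1": 1, "p2": 1, "p3": 1, "p4": 2, "p5": 2, "p6": 3, "p7": 4, "p8": 6}
cases = {"p6", "p8"}

for k, prev in prevalence_by_min_visits(visits, cases, [1, 2, 3]).items():
    print(f"min visits >= {k}: prevalence = {prev:.3f}")
```

In this toy data the estimate rises as the threshold rises because the denominator shrinks faster than the numerator. That pattern is built into the example data; whether and how quickly it holds in real EHR data is the empirical question the study addresses.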
In addition to providing thorough descriptions of case definitions, authors using EHR data must carefully describe how a study sample is identified and report data for a range of sample definitions, including minimum number of visits, so that others can assess the sensitivity of reported results to sample definition in EHR data.