Montvida Olga, Dibato John Epoh, Paul Sanjoy
Melbourne EpiCentre, University of Melbourne, Melbourne, Australia.
JMIR Med Inform. 2020 Jun 3;8(6):e17174. doi: 10.2196/17174.
Electronic medical record (EMR)-based clinical and epidemiological research has dramatically increased over the last decade, although establishing the generalizability of such big databases for conducting epidemiological studies has been an ongoing challenge. To draw meaningful inferences from such studies, it is essential to fully understand the characteristics of the underlying population and potential biases in EMRs.
This study aimed to assess the generalizability and representativity of the widely used US Centricity Electronic Medical Record (CEMR), a primary and ambulatory care EMR for population health research, using data from the National Ambulatory Medical Care Surveys (NAMCS) and the National Health and Nutrition Examination Surveys (NHANES).
The number of office visits reported in the NAMCS, designed to meet the need for objective and reliable information about the provision and the use of ambulatory medical care services, was compared with similar data from the CEMR. The distribution of major cardiometabolic diseases in the NHANES, designed to assess the health and nutritional status of adults and children in the United States, was compared with similar data from the CEMR.
Gender and ethnicity distributions were similar between the NAMCS and the CEMR. Younger patients (aged <15 years) were underrepresented in the CEMR compared with the NAMCS. The number of office visits per 100 persons per year was similar: 277.9 (95% CI 259.3-296.5) in the NAMCS and 284.6 (95% CI 284.4-284.7) in the CEMR. However, the number of visits for males was significantly higher in the CEMR than in the NAMCS (270.8 vs 239.0). The West region was underrepresented and the South region was overrepresented in the CEMR. The overall prevalence of diabetes, along with its age and gender distribution, was similar in the CEMR and the NHANES: overall prevalence, 10.1% and 9.7%; male, 11.5% and 10.8%; female, 9.1% and 8.8%; age 20 to 40 years, 2.5% and 1.8%; and age 40 to 60 years, 9.4% and 11.1%, respectively. The prevalence of obesity was similar (42.1% and 39.6%), with a similar age distribution and similar prevalence among females (41.5% and 41.1%) but a different prevalence among males (42.7% and 37.9%). The overall prevalence of high cholesterol, along with its age and female distribution, was similar in the CEMR and the NHANES: overall prevalence, 12.4% and 12.4%; and female, 14.8% and 13.2%, respectively. The overall prevalence of hypertension was significantly higher in the CEMR (33.5%) than in the NHANES (95% CI 27.0%-31.0%).
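One simple way to read the comparisons above is to check whether a CEMR point estimate falls inside the 95% CI reported for the reference survey. The abstract does not state which statistical test the authors used, so the sketch below is an illustrative assumption rather than their method. The names `Comparison` and `classify` are hypothetical, and the numbers are only those quoted above.

```python
from typing import NamedTuple


class Comparison(NamedTuple):
    """A CEMR point estimate and the reference survey's 95% CI."""
    measure: str
    cemr_estimate: float
    ref_ci_low: float
    ref_ci_high: float


def classify(c: Comparison) -> str:
    """Place the CEMR estimate relative to the reference CI.

    Assumption: an estimate inside the interval is treated as similar.
    This is an illustrative rule, not the test used in the study.
    """
    if c.cemr_estimate < c.ref_ci_low:
        return "lower"
    if c.cemr_estimate > c.ref_ci_high:
        return "higher"
    return "within"


# Figures taken from the results reported above.
COMPARISONS = [
    Comparison("office visits per 100 persons per year (vs NAMCS)",
               284.6, 259.3, 296.5),
    Comparison("hypertension prevalence, % (vs NHANES)",
               33.5, 27.0, 31.0),
]

if __name__ == "__main__":
    for c in COMPARISONS:
        print(f"{c.measure}: CEMR {c.cemr_estimate} is {classify(c)} "
              f"relative to {c.ref_ci_low}-{c.ref_ci_high}")
```

Under this rule, the office visit rate in the CEMR lies within the NAMCS interval, and the hypertension prevalence lies above the NHANES interval, which matches the wording of the results. The rule ignores the uncertainty of the CEMR estimate itself, which is small here given the narrow CEMR interval for office visits.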
The distribution of major cardiometabolic diseases in the CEMR is comparable with the national survey results. The CEMR represents the general US population well in terms of office visits and major chronic conditions. However, subgroups may differ in age and gender distribution and in disease prevalence, and future studies should account for these differences carefully.