Center for Population Health Information Technology, Department of Health Policy and Management, Bloomberg School of Public Health.
Division of Health Sciences and Informatics, Department of General Internal Medicine, University School of Medicine, Johns Hopkins University, Baltimore, Maryland.
J Am Geriatr Soc. 2018 Aug;66(8):1499-1507. doi: 10.1111/jgs.15411. Epub 2018 Jul 4.
To examine the value of unstructured electronic health record (EHR) data (free-text notes) in identifying a set of geriatric syndromes.
Retrospective analysis of unstructured EHR notes using a natural language processing (NLP) algorithm.
Large multispecialty group.
Older adults (N=18,341; average age 75.9, 58.9% female).
We compared the number of geriatric syndrome cases identified using structured claims data, structured EHR data, and unstructured EHR data. We also calculated these rates using a population-level claims database as a reference and identified comparable epidemiological rates in peer-reviewed literature as a benchmark.
Using insurance claims data resulted in a geriatric syndrome prevalence ranging from 0.03% for lack of social support to 8.3% for walking difficulty. Using structured EHR data resulted in similar prevalence rates, ranging from 0.03% for malnutrition to 7.85% for walking difficulty. Incorporating unstructured EHR notes, enabled by applying the NLP algorithm, identified considerably higher rates of geriatric syndromes: absence of fecal control (2.1%, 2.3 times as much as structured claims and EHR data combined), decubitus ulcer (1.4%, 1.7 times as much), dementia (6.7%, 1.5 times as much), falls (23.6%, 3.2 times as much), malnutrition (2.5%, 18.0 times as much), lack of social support (29.8%, 455.9 times as much), urinary retention (4.2%, 3.9 times as much), vision impairment (6.2%, 7.4 times as much), weight loss (19.2%, 2.9 times as much), and walking difficulty (36.34%, 3.4 times as much). The geriatric syndrome rates extracted from structured data were substantially lower than published epidemiological rates, although adding the NLP results considerably closed this gap.
Claims and structured EHR data give an incomplete picture of burden related to geriatric syndromes. Geriatric syndromes are likely to be missed if unstructured data are not analyzed. Pragmatic NLP algorithms can assist with identifying individuals at high risk of experiencing geriatric syndromes and improving coordination of care for older adults.
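The comparison reported above (prevalence from structured data alone versus prevalence after adding cases found in free-text notes) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the keyword patterns, the function names, and the four-patient toy data are hypothetical and are not the study's NLP algorithm or data. The simple keyword matching also ignores negation ("denies falls"), which a practical NLP pipeline would need to handle.

```python
import re

# Hypothetical keyword patterns for two syndromes. The abstract does not
# describe the study's actual NLP rules; these are illustrative only.
PATTERNS = {
    "falls": re.compile(r"\b(?:fall|falls|fell|fallen)\b", re.IGNORECASE),
    "weight loss": re.compile(r"\bweight loss\b|\blost weight\b", re.IGNORECASE),
}


def syndromes_in_note(text):
    """Return the set of syndrome names whose pattern matches the note."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


def combine(structured, notes):
    """Union each patient's structured flags with flags found in their notes."""
    combined = {}
    for patient_id in structured.keys() | notes.keys():
        flags = set(structured.get(patient_id, set()))
        for text in notes.get(patient_id, []):
            flags |= syndromes_in_note(text)
        combined[patient_id] = flags
    return combined


def prevalence(patient_flags, syndrome):
    """Percentage of patients flagged with the given syndrome."""
    hits = sum(1 for flags in patient_flags.values() if syndrome in flags)
    return 100.0 * hits / len(patient_flags)


# Toy data (invented): structured codes capture one fall; notes reveal more.
structured = {"p1": {"falls"}, "p2": set(), "p3": set(), "p4": set()}
notes = {
    "p1": ["Patient fell at home last week."],
    "p2": ["Reports unintentional weight loss."],
    "p3": ["Had two falls this month."],
    "p4": ["No complaints."],
}

combined = combine(structured, notes)
structured_rate = prevalence(structured, "falls")
combined_rate = prevalence(combined, "falls")
print(f"falls: structured {structured_rate:.1f}%, with notes {combined_rate:.1f}%, "
      f"ratio {combined_rate / structured_rate:.1f}x")
```

The ratio printed at the end corresponds to the "times as much" figures in the results, under the assumption that those figures compare the rate with notes included against the rate from structured sources alone.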