Extracting Cognitive Impairment Assessment Information From Unstructured Notes in Electronic Health Records Using Natural Language Processing Tools: Validation with Clinical Assessment Data.

Author information

Wang Kuan-Yuan, Mahesri Mufaddal, Novoa-Laurentiev John, Bessette Lily G, York Cassandra, Zakoul Heidi, Lee Su Been, Ngan Kerry, Zhou Li, Kim Dae Hyun, Lin Kueiyu Joshua

Affiliations

National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan.

Marcus Institute for Aging Research, Hebrew SeniorLife, Boston, MA, USA.

Publication information

Clin Epidemiol. 2025 Apr 15;17:353-365. doi: 10.2147/CLEP.S504259. eCollection 2025.

Abstract

PURPOSE

We aimed to develop a Natural Language Processing (NLP) algorithm to extract cognitive scores from electronic health records (EHR) data and compare them with cognitive function recorded by Centers for Medicare & Medicaid Services (CMS)-mandated clinical assessments in nursing homes and home health visits.

PATIENTS AND METHODS

We identified a cohort of Medicare beneficiaries who had either Minimum Data Set (MDS) or Outcome and Assessment Information Set (OASIS) assessments linked to EHR data from the Research Patient Data Registry (Mass General Brigham, Boston, MA) from 2010 to 2019. We applied an NLP approach to identify Montreal Cognitive Assessment (MoCA) and Mini-Mental State Examination (MMSE) scores in unstructured clinician notes in the EHR. We then compared the mean NLP-extracted MoCA and MMSE scores between cognition-status groups defined by MDS data (impaired vs intact cognition) and by OASIS data (severe impairment vs intact cognition).
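The abstract does not describe how the NLP approach works internally. As a rough illustration only, the sketch below shows one simple way that MoCA or MMSE scores could be pulled from free-text notes with a regular expression. The pattern, the function name `extract_scores`, and the 0-30 range check are assumptions made for this example and are not the authors' algorithm.

```python
import re

# Hypothetical pattern (not from the paper): a test name, a short run of
# non-digit filler, a one- or two-digit score, and an optional "/30".
SCORE_PATTERN = re.compile(
    r"\b(?P<test>MoCA|MMSE)\b"              # test name
    r"[^\d\n]{0,30}?"                        # filler such as " score was "
    r"(?P<score>\d{1,2})(?!\d)"              # the score, not part of a longer number
    r"(?:\s*/\s*(?P<max>\d{1,2})(?!\d))?",   # optional denominator, e.g. "/30"
    re.IGNORECASE,
)


def extract_scores(note: str) -> list[tuple[str, int]]:
    """Return (test, score) pairs found in a clinical note.

    Assumption: both tests are scored from 0 to 30, so values outside that
    range, or with a denominator other than 30, are discarded.
    """
    results = []
    for match in SCORE_PATTERN.finditer(note):
        score = int(match.group("score"))
        max_score = match.group("max")
        if max_score is not None and int(max_score) != 30:
            continue  # likely a date or another scale, not a test score
        if 0 <= score <= 30:
            test = "MoCA" if match.group("test").lower() == "moca" else "MMSE"
            results.append((test, score))
    return results


if __name__ == "__main__":
    print(extract_scores("MoCA 22/30 today. MMSE score was 27."))
```

A pattern this simple would miss many real-world phrasings (for example, a date placed between the test name and the score), which is why any extraction approach needs validation against manual chart review.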

RESULTS

Our study cohort comprised 7419 patients with MDS (19.7%) or OASIS (80.3%) assessments; the mean age was 80 (SD=7) years and 60% were female. In the EHR, the NLP algorithm extracted cognitive test scores with 97% accuracy (95% CI: 92-99%) for MoCA and 100% accuracy (95% CI: 84-100%) for MMSE. In MDS, the mean difference in extracted MoCA between the impaired and intact cognition groups was -5.6 (95% CI: -8.7, -2.4, p=0.0008), and the mean difference in extracted MMSE was -7.9 (95% CI: -12.4, -3.5, p=0.0012). In OASIS, the mean differences in extracted MoCA and extracted MMSE between the severe impairment and intact cognition groups were -4.8 (95% CI: -9.1, -0.6, p=0.0006) and -4.5 (95% CI: -9.5, -0.5, p=0.0182), respectively.
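For readers unfamiliar with how a mean difference and its confidence interval are formed, the following is a minimal sketch using a normal approximation with unpooled variances. The abstract does not state which statistical method the authors used, so the function `mean_difference_ci` and the toy scores are illustrative assumptions, not study data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Return (difference, lower, upper) for mean(group_a) - mean(group_b).

    Assumption: a normal-approximation interval with unpooled sample
    variances; a t-based interval would be wider for small samples.
    """
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference between two independent means.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se


if __name__ == "__main__":
    # Made-up toy scores for illustration only; not values from the study.
    impaired = [18, 20, 22]
    intact = [25, 27, 29]
    print(mean_difference_ci(impaired, intact))
```

A negative difference here means the first group (impaired) scored lower on average than the second (intact).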

CONCLUSION

We developed an NLP algorithm that accurately extracts cognitive scores from unstructured EHR notes, and the extracted scores were well correlated with cognitive function recorded in CMS-mandated clinical assessments. This could help researchers identify patients with varying degrees of cognitive impairment in EHR-based research.
