

Using Natural Language Processing to Identify Different Lens Pathology in Electronic Health Records.

Author information

From the W.K. Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, Michigan, USA (J.D.S., Y.Z., C.A.A., J.B.); Department of Health Management and Policy, University of Michigan School of Public Health, Ann Arbor, Michigan, USA (J.D.S.).


Publication information

Am J Ophthalmol. 2024 Jun;262:153-160. doi: 10.1016/j.ajo.2024.01.030. Epub 2024 Feb 1.

Abstract

PURPOSE

Nearly all published ophthalmology-related Big Data studies rely exclusively on International Classification of Diseases (ICD) billing codes to identify patients with particular ocular conditions. However, inaccurate or nonspecific codes may be used. We assessed whether natural language processing (NLP), as an alternative approach, could more accurately identify lens pathology.

DESIGN

Database study comparing the accuracy of NLP versus ICD billing codes to properly identify lens pathology.

METHODS

We developed an NLP algorithm capable of searching free-text lens exam data in the electronic health record (EHR) to identify the type(s) of cataract present, cataract density, presence of intraocular lenses, and other lens pathology. We applied our algorithm to 17.5 million lens exam records in the Sight Outcomes Research Collaborative (SOURCE) repository. We selected 4314 unique lens-exam entries and asked 11 clinicians to assess whether all pathology present in the entries had been correctly identified in the NLP algorithm output. The algorithm's sensitivity for identifying lens pathology was compared with that of the ICD codes.
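The abstract does not describe how the algorithm works internally. As an illustration only, the sketch below shows one simple way free-text lens exam entries could be mapped to structured findings using keyword patterns. The pattern names, abbreviations, and the `extract_lens_findings` function are hypothetical and are not the authors' method; a real system would also need to handle negation, laterality, and misspellings.

```python
import re

# Hypothetical keyword patterns for illustration; the published
# algorithm's actual rules are not given in the abstract.
PATTERNS = {
    "nuclear_sclerosis": re.compile(r"\b(ns|nuclear scleros[ie]s|nuclear sclerotic)\b", re.I),
    "cortical_cataract": re.compile(r"\bcortical\b", re.I),
    "posterior_subcapsular": re.compile(r"\b(psc|posterior subcapsular)\b", re.I),
    "intraocular_lens": re.compile(r"\b(pciol|aciol|iol|pseudophak\w*)\b", re.I),
    "pseudoexfoliation": re.compile(r"\b(pxf|pseudoexfoliat\w*)\b", re.I),
}

# Assumed grading convention: a digit 1-4 followed by "+", e.g. "2+ NS".
DENSITY = re.compile(r"\b([1-4])\s*\+")


def extract_lens_findings(text):
    """Return the finding labels and first density grade found in one entry."""
    findings = sorted(name for name, pat in PATTERNS.items() if pat.search(text))
    grade = DENSITY.search(text)
    return {
        "findings": findings,
        "density": int(grade.group(1)) if grade else None,
    }


if __name__ == "__main__":
    for entry in ["2+ NS, trace cortical", "PCIOL in good position", "clear"]:
        print(entry, "->", extract_lens_findings(entry))
```

Keyword matching of this kind is easy to audit against clinician review, which is the validation design the study describes, but it will miss phrasing the patterns do not anticipate.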

RESULTS

The NLP algorithm correctly identified all lens pathology present in 4104 of the 4314 lens-exam entries (95.1%). For less common lens pathology, reviewing clinicians corroborated the algorithm's findings for 100% of mentions of pseudoexfoliation material and 99.7% of mentions of phimosis, subluxation, and synechia. Sensitivity at identifying lens pathology was better for NLP (0.98 [0.96-0.99]) than for billing codes (0.49 [0.46-0.53]).
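The headline agreement figure follows directly from the reported counts, and sensitivity is the share of true cases that a method detects. The short sketch below recomputes the 95.1% figure and shows the sensitivity formula. The abstract does not report the counts behind the sensitivity estimates, so the values passed to `sensitivity` are hypothetical and serve only to demonstrate the calculation.

```python
def proportion(numerator, denominator):
    """Fraction of entries meeting a criterion."""
    return numerator / denominator


def sensitivity(true_positives, false_negatives):
    """Share of truly present pathology that the method detected."""
    return true_positives / (true_positives + false_negatives)


# Counts reported in the abstract: 4104 of 4314 entries fully correct.
agreement = proportion(4104, 4314)
print(f"{agreement:.1%}")  # prints 95.1%

# Hypothetical counts, chosen only to illustrate the formula.
example = sensitivity(true_positives=98, false_negatives=2)
print(f"{example:.2f}")  # prints 0.98
```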

CONCLUSIONS

Our NLP algorithm identifies and classifies lens abnormalities routinely documented by eye-care professionals with high accuracy. Such algorithms will help researchers to properly identify and classify ocular pathology, broadening the scope of feasible research using real-world data.

