Generalizability and portability of natural language processing system to extract individual social risk factors.

Affiliations

College of Medicine, University of Florida, Gainesville, FL, USA.

Regenstrief Institute, Inc., Indianapolis, IN, USA; Richard M. Fairbanks School of Public Health, IUPUI, Indianapolis, IN, USA.

Publication information

Int J Med Inform. 2023 Sep;177:105115. doi: 10.1016/j.ijmedinf.2023.105115. Epub 2023 Jun 5.

Abstract

OBJECTIVE

This study aimed to validate and report on the portability and generalizability of a natural language processing (NLP) method, originally developed at a different institution, for extracting individual social factors from clinical notes.

MATERIALS AND METHODS

A rule-based, deterministic state machine NLP model was developed to extract financial insecurity and housing instability using notes from one institution, and was then applied to all notes written during a 6-month period at another institution. Ten percent of the notes classified as positive by the NLP model, and the same number of notes classified as negative, were manually annotated. The NLP model was adjusted to accommodate notes at the new site. Accuracy, positive predictive value, sensitivity, and specificity were calculated.
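The four measures named above can be derived from a confusion matrix that compares the NLP classification with the manual annotation. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the names `ConfusionCounts`, `count_outcomes`, and `validation_metrics` are hypothetical, and labels are assumed to be booleans (True = positive for the social factor).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of NLP predictions compared against manual annotations."""
    tp: int  # NLP positive, annotator positive
    fp: int  # NLP positive, annotator negative
    tn: int  # NLP negative, annotator negative
    fn: int  # NLP negative, annotator positive


def count_outcomes(predicted, annotated):
    """Tally a confusion matrix from paired boolean labels (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for pred, gold in zip(predicted, annotated):
        if pred and gold:
            tp += 1
        elif pred and not gold:
            fp += 1
        elif not pred and gold:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def validation_metrics(c):
    """Compute accuracy, PPV, sensitivity, and specificity from the counts."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "ppv": _ratio(c.tp, c.tp + c.fp),           # positive predictive value
        "sensitivity": _ratio(c.tp, c.tp + c.fn),   # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),   # true negative rate
    }


if __name__ == "__main__":
    # Made-up labels for illustration only; these are not data from the study.
    nlp_labels = [True, True, True, False, False, False]
    manual_labels = [True, True, False, False, False, True]
    print(validation_metrics(count_outcomes(nlp_labels, manual_labels)))
```

Returning None for an undefined ratio, rather than raising an error, is a design choice made here so that a sample containing no positives or no negatives still yields the remaining measures.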

RESULTS

The NLP model processed more than 6 million notes at the receiving site, classifying about 13,000 as positive for financial insecurity and about 19,000 as positive for housing instability. The NLP model showed excellent performance on the validation dataset, with all measures above 0.87 for both social factors.

DISCUSSION

Our study illustrated the need to accommodate institution-specific note-writing templates, as well as clinical terminology for emergent diseases, when applying an NLP model for social factors. A state machine is relatively simple to port effectively across institutions. Our study showed superior performance to similar generalizability studies for extracting social factors.
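To illustrate why a rule-based state machine can be comparatively easy to adapt to a new site, here is a toy sketch in which the trigger terms are a swappable parameter. It is not the authors' model: the term lists, the two states (`SEEK`, `NEGATED`), and the functions `classify_sentence` and `classify_note` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lexicons, for illustration only; a real system would use
# curated, site-reviewed term lists.
HOUSING_TERMS = frozenset({"homeless", "eviction", "shelter"})
NEGATION_TERMS = frozenset({"no", "not", "denies"})
SCOPE_BREAKERS = frozenset({"but", "however"})


def classify_sentence(sentence, trigger_terms=HOUSING_TERMS):
    """Scan tokens left to right with a two-state deterministic machine.

    SEEK:    a trigger term counts as a positive mention.
    NEGATED: a negation word was seen; trigger terms are ignored until a
             scope breaker returns the machine to SEEK.
    """
    state = "SEEK"
    for token in re.findall(r"[a-z]+", sentence.lower()):
        if token in SCOPE_BREAKERS:
            state = "SEEK"
        elif token in NEGATION_TERMS:
            state = "NEGATED"
        elif token in trigger_terms and state == "SEEK":
            return True
    return False


def classify_note(note, trigger_terms=HOUSING_TERMS):
    """Flag a note as positive if any sentence contains an un-negated trigger."""
    sentences = re.split(r"[.!?\n]+", note)
    return any(classify_sentence(s, trigger_terms) for s in sentences)


if __name__ == "__main__":
    print(classify_note("Patient is homeless."))
    print(classify_note("Patient denies eviction. Lives with family."))
    # Site-specific adaptation in this sketch means passing a different lexicon.
    print(classify_note("Pt behind on rent.", trigger_terms=frozenset({"rent"})))
```

In this sketch, adapting to a new institution amounts to editing term lists rather than retraining a model, which is one plausible reading of the portability claim above.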

CONCLUSION

A rule-based NLP model for extracting social factors from clinical notes showed strong portability and generalizability across organizationally and geographically distinct institutions. With only relatively simple modifications, we obtained promising performance from an NLP-based model.
