Classifying Individuals With Rheumatic Conditions as Financially Insecure Using Electronic Health Record Data and Natural Language Processing: Algorithm Derivation and Validation.

Authors

Chandler Mia T, Cai Tianrun, Santacroce Leah, Ulysse Sciaska, Liao Katherine P, Feldman Candace H

Affiliations

Boston Children's Hospital, Boston, Massachusetts.

Harvard Medical School, Boston, Massachusetts.

Publication information

ACR Open Rheumatol. 2024 Aug;6(8):481-488. doi: 10.1002/acr2.11675. Epub 2024 May 15.

DOI:10.1002/acr2.11675

PMID:38747148

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11319925/

Abstract

OBJECTIVE

We aimed to examine the feasibility of applying natural language processing (NLP) to unstructured electronic health record (EHR) documents to detect the presence of financial insecurity among patients with rheumatologic disease enrolled in an integrated care management program (iCMP).

METHODS

We incorporated supervised, rule-based NLP and statistical methods to identify financial insecurity among patients with rheumatic conditions enrolled in an iCMP (n = 20,395) in a multihospital EHR system. We constructed a lexicon for financial insecurity using data from available knowledge sources and then reviewed EHR notes from 538 randomly selected individuals (training cohort n = 366, validation cohort n = 172). We manually categorized records as having "definite," "possible," or "no" mention of financial insecurity. All available notes were processed using Narrative Information Linear Extraction (NILE), a rule-based NLP tool. Models were trained on the NLP features for financial insecurity using logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest, and their performance characteristics were compared with the reference standard.
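
The pipeline described above (lexicon matching over notes, then a penalized classifier on the resulting features) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not NILE: the lexicon terms, negation cues, feature definitions, toy notes, and every function name here are hypothetical, the reviewer labels are collapsed to a binary outcome, and the L1-penalized logistic regression is fitted with a simple proximal gradient loop rather than any particular statistical package.

```python
import math
import re

# Hypothetical lexicon; the abstract does not list the study's actual terms.
LEXICON = ("financial hardship", "cannot afford", "unable to pay", "eviction")
# Hypothetical negation cues, checked in a short window before each match.
NEGATION_CUES = ("no ", "denies ", "without ")


def count_mentions(note):
    """Count lexicon terms in one note, skipping naively negated mentions."""
    text = note.lower()
    total = 0
    for term in LEXICON:
        for match in re.finditer(re.escape(term), text):
            window = text[max(0, match.start() - 20):match.start()]
            if not any(cue in window for cue in NEGATION_CUES):
                total += 1
    return total


def patient_features(notes):
    """Return [log(1 + mentions), log(1 + number of notes)] for one patient."""
    mentions = sum(count_mentions(note) for note in notes)
    return [math.log1p(mentions), math.log1p(len(notes))]


def soft_threshold(value, amount):
    """Shrink value toward zero by amount; this is the L1 proximal step."""
    return math.copysign(max(abs(value) - amount, 0.0), value)


def fit_l1_logistic(rows, labels, penalty=0.01, step=0.5, epochs=500):
    """Fit L1-penalized logistic regression by proximal gradient descent."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(rows, labels):
            score = bias + sum(w * x for w, x in zip(weights, row))
            error = 1.0 / (1.0 + math.exp(-score)) - label
            grad_b += error / n
            for j in range(d):
                grad_w[j] += error * row[j] / n
        bias -= step * grad_b  # the intercept is left unpenalized
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


# Toy, invented notes for four hypothetical patients (1 = financially insecure).
patients = [
    (["Patient cannot afford copay.", "Reports eviction notice."], 1),
    (["Unable to pay for medication this month."], 1),
    (["Denies financial hardship.", "Routine follow-up."], 0),
    (["Joint pain improved."], 0),
]
rows = [patient_features(notes) for notes, _ in patients]
labels = [label for _, label in patients]
weights, bias = fit_l1_logistic(rows, labels)
print(rows, weights, bias)
```

The substring-based negation check is deliberately crude and would misfire on real clinical text; it only shows where a rule-based step sits relative to the statistical model. The L1 penalty is what lets a LASSO-style model drop uninformative features by shrinking their weights to exactly zero.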

RESULTS

A total of 245,142 notes were processed from 538 individual patient records. Financial insecurity was present among 100 (27%) individuals in the training cohort and 63 (37%) in the validation cohort. The LASSO and random forest models performed identically and slightly better than logistic regression, with positive predictive values of 0.90, sensitivities of 0.29, and specificities of 0.98.
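
To make the three reported metrics concrete, the sketch below computes them from a 2x2 confusion matrix. The abstract does not report the underlying counts, so the values used here are illustrative only: they were chosen to be consistent with the rounded figures and the validation cohort size (172 individuals, 63 with financial insecurity), on the assumption that the metrics refer to that cohort. The function name is hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return (PPV, sensitivity, specificity) from confusion-matrix counts."""
    ppv = tp / (tp + fp)          # of those flagged, how many are true cases
    sensitivity = tp / (tp + fn)  # of true cases, how many were flagged
    specificity = tn / (tn + fp)  # of non-cases, how many were left unflagged
    return ppv, sensitivity, specificity


# Illustrative counts only; they are NOT reported in the abstract.
tp, fp, fn, tn = 18, 2, 45, 107
ppv, sensitivity, specificity = screening_metrics(tp, fp, fn, tn)
print(f"PPV={ppv:.2f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
# prints "PPV=0.90 sensitivity=0.29 specificity=0.98"
```

Read this way, a model with these characteristics rarely flags someone incorrectly but misses most individuals with financial insecurity, which matches the conclusion's call to improve sensitivity.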

CONCLUSION

Developing a context-driven lexicon for use with rule-based NLP to extract data that identify financial insecurity is feasible and improved the capture of the presence of financial insecurity with high accuracy. In the absence of a standard lexicon and construct definition for financial insecurity status, additional studies are needed to optimize the sensitivity of algorithms to categorize financial insecurity with construct validity.
