Validation of psoriatic arthritis diagnoses in electronic medical records using natural language processing.

Affiliation

Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA.

Publication information

Semin Arthritis Rheum. 2011 Apr;40(5):413-20. doi: 10.1016/j.semarthrit.2010.05.002. Epub 2010 Aug 10.

Abstract

OBJECTIVES

To test whether data extracted from full-text patient visit notes in an electronic medical record would improve the classification of psoriatic arthritis (PsA) compared with an algorithm based on codified data alone.

METHODS

From the more than 1,350,000 adults in a large academic electronic medical record, all 2318 patients with a billing code for PsA were identified, and 550 were randomly selected for chart review and algorithm training. Thirty-one predictors were derived from codified data and from phrases extracted from narrative notes using natural language processing, and 3 random forest algorithms were trained using coded, narrative, and combined predictors. The receiver operating characteristic (ROC) curve was used to identify the optimal algorithm, and a cut-point was chosen to achieve the maximum sensitivity possible at a 90% positive predictive value (PPV). The algorithm was then used to classify the remaining 1768 charts and was finally validated in a random sample of 300 cases predicted to have PsA.
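The cut-point rule described above (maximum sensitivity subject to a PPV floor) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `choose_cut_point`, the score and label lists, and the exhaustive threshold scan are all assumptions made for illustration, and the toy data are invented.

```python
from typing import List, Optional, Tuple


def choose_cut_point(
    scores: List[float], labels: List[int], min_ppv: float = 0.90
) -> Optional[Tuple[float, float, float]]:
    """Scan candidate thresholds and return (threshold, sensitivity, ppv)
    for the threshold with the highest sensitivity whose PPV is at least
    min_ppv. Returns None if no threshold qualifies.

    scores: predicted probabilities (hypothetical classifier output)
    labels: 1 = chart-confirmed case, 0 = not a case
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return None  # sensitivity is undefined without confirmed cases

    best: Optional[Tuple[float, float, float]] = None
    # Every distinct score is a candidate threshold; "score >= t" means
    # the patient is classified as a case.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        ppv = tp / (tp + fp)  # tp + fp >= 1 because t is itself a score
        sensitivity = tp / total_pos
        if ppv >= min_ppv and (best is None or sensitivity > best[1]):
            best = (t, sensitivity, ppv)
    return best


if __name__ == "__main__":
    # Invented toy data, not taken from the study.
    toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30]
    toy_labels = [1, 1, 1, 0, 1, 0, 0]
    print(choose_cut_point(toy_scores, toy_labels, min_ppv=0.90))
```

Lowering the threshold admits more true cases (higher sensitivity) but also more false positives (lower PPV), so the scan keeps only thresholds that still satisfy the PPV floor and picks the most sensitive of those.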

RESULTS

The PPV of a single PsA code was 57% (95% CI 55%-58%). Using a combination of coded data and natural language processing (NLP), the random forest algorithm reached a PPV of 90% (95% CI 86%-93%) at a sensitivity of 87% (95% CI 83%-91%) in the training data. The PPV was 93% (95% CI 89%-96%) in the validation set. Adding NLP predictors to codified data increased the area under the receiver operating characteristic curve (P < 0.001).
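For readers unfamiliar with how a PPV and its 95% confidence interval are obtained from chart-review counts, the following is a minimal sketch. The abstract does not state which interval method was used; the Wilson score interval here is an assumption, the function names are hypothetical, and the counts in the example are invented rather than taken from the study.

```python
import math
from typing import Tuple


def ppv(true_positives: int, predicted_positives: int) -> float:
    """Positive predictive value: confirmed cases / all predicted cases."""
    if predicted_positives <= 0:
        raise ValueError("predicted_positives must be positive")
    return true_positives / predicted_positives


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Invented counts: 90 of 100 predicted cases confirmed on chart review.
    confirmed, predicted = 90, 100
    low, high = wilson_interval(confirmed, predicted)
    print(f"PPV = {ppv(confirmed, predicted):.2f}, 95% CI {low:.3f}-{high:.3f}")
```

Other interval methods (for example exact binomial intervals) give slightly different bounds, so this sketch should not be expected to reproduce the reported intervals exactly.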

CONCLUSIONS

Using NLP with text notes from electronic medical records significantly improved the performance of the prediction algorithm. Random forests were a useful tool for accurately classifying psoriatic arthritis cases to enable epidemiological research.
