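Automation of penicillin adverse drug reaction categorisation and risk stratification with machine learning natural language processing.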

BACKGROUND: The penicillin adverse drug reaction (ADR) label is common in electronic health records (EHRs). However, allergy and intolerance are frequently misclassified within the EHR, and most patients can be delabelled after an immunologic assessment. Machine learning natural language processing may be able to assist with the categorisation and risk stratification of penicillin ADRs.

OBJECTIVE: The aim of this study was to use text entered into an EHR to derive and evaluate machine learning models that classify penicillin ADRs and assess the risk of true allergy.

METHODS: Machine learning natural language processing was applied to free-text penicillin ADR data extracted from a public health system EHR. The models were developed by training on a labelled dataset. ADR entries were split into training and testing datasets and used to develop and test a variety of machine learning models. These were compared with categorisation by a simple algorithm using keyword search.

RESULTS: The best performing model for the classification of penicillin ADRs as being consistent with allergy or intolerance was the artificial neural network (AUC 0.994, sensitivity 0.99, specificity 0.96). The artificial neural network also achieved the highest AUC in the classification of high or low risk of true allergy (AUC 0.988, sensitivity 0.99, specificity 0.99). All ADR labels could be classified using these machine learning models, whereas a small proportion were unclassifiable using the simple algorithm because they contained no keywords.

CONCLUSION: Machine learning natural language processing performed similarly to expert criteria in classifying and risk stratifying penicillin ADR labels. These models outperformed simpler algorithms in their ability to interpret free-text data contained in the EHR. The automated evaluation of penicillin ADR labels may allow real-time risk stratification to facilitate delabelling and improve the specificity of prescribing alerts.
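
The keyword-search comparator and the reported sensitivity and specificity can be illustrated with a minimal Python sketch. It is not the study's implementation: the keyword lists, the function names `keyword_classify` and `sensitivity_specificity`, and the example entries are all hypothetical, invented here only to show why a keyword rule leaves some free-text entries unclassifiable.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical keyword lists for illustration only; the abstract does not
# state which keywords the study's simple algorithm used.
ALLERGY_KEYWORDS = ("anaphylaxis", "hives", "urticaria", "angioedema", "rash")
INTOLERANCE_KEYWORDS = ("nausea", "vomiting", "diarrhoea", "diarrhea", "headache")


def keyword_classify(entry: str) -> Optional[str]:
    """Classify a free-text ADR entry by substring keyword search.

    Returns "allergy", "intolerance", or None when no keyword matches
    (the "unclassifiable" case described in the abstract).
    """
    text = entry.lower()
    if any(keyword in text for keyword in ALLERGY_KEYWORDS):
        return "allergy"
    if any(keyword in text for keyword in INTOLERANCE_KEYWORDS):
        return "intolerance"
    return None


def sensitivity_specificity(
    y_true: Sequence[str],
    y_pred: Sequence[Optional[str]],
    positive: str = "allergy",
) -> Tuple[float, float]:
    """Compute sensitivity and specificity for the `positive` class.

    Assumption of this sketch: any prediction other than `positive`
    (including None) is counted as a negative prediction.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Invented example entries, not data from the study.
    entries = [
        "Hives and facial swelling",
        "nausea after first dose",
        "unsure, childhood reaction",
    ]
    for entry in entries:
        print(entry, "->", keyword_classify(entry))
```

The third invented entry contains none of the listed keywords, so the rule returns `None`. A trained text classifier, by contrast, assigns a class to every entry, which matches the abstract's observation that all labels could be classified by the machine learning models.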
