Artificial Intelligence for Unstructured Healthcare Data: Application to Coding of Patient Reporting of Adverse Drug Reactions.

Author affiliations

INSERM, BPH, U1219, Team Pharmacoepidemiology, Univ. Bordeaux, Bordeaux, France.

CHU de Bordeaux, Pole de Santé Publique, Service de Pharmacologie Médicale, Centre de Pharmacovigilance de Bordeaux, Bordeaux, France.

Publication information

Clin Pharmacol Ther. 2021 Aug;110(2):392-400. doi: 10.1002/cpt.2266. Epub 2021 May 8.

Abstract

Adverse drug reaction (ADR) reporting is a major component of drug safety monitoring; its contribution will, however, only be optimized if systems can handle the tremendous flow of information it generates, which is based primarily on unstructured text fields. The aim of this study was to develop an automated system for coding ADRs from patient reports. Our system was based on a knowledge base about drugs, enriched by supervised machine learning (ML) models trained on patient-reported data. To train our models, we selected all cases of ADRs reported by patients to a French Pharmacovigilance Centre through a national web portal between March 2017 and March 2019 (n = 2,058 reports). We tested both conventional ML models and deep-learning models. We performed an external validation using a dataset consisting of a random sample of ADRs reported to the Marseille Pharmacovigilance Centre over the same period (n = 187). Here, we show that, in terms of area under the curve (AUC) and F-measure, the best model for identifying ADRs was gradient boosting trees (LGBM), with an AUC of 0.93 (0.92-0.94) and an F-measure of 0.72 (0.68-0.75). In external validation, this model achieved an AUC of 0.91 and an F-measure of 0.58. We evaluated an artificial intelligence pipeline that was found able to learn to correctly identify ADRs from unstructured data. This result allowed us to start a new study using more data to further improve performance and to offer a tool that is useful in practice for efficiently managing drug safety information.
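
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AUC and an F-measure can be computed for a binary classifier. It is not the authors' code. The function names `f_measure` and `auc`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration, and the F-measure is assumed here to be the balanced F1 variant, which the abstract does not specify.

```python
def f_measure(y_true, y_pred):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive is scored above a randomly chosen negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = text mentions the ADR, 0 = it does not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical classifier scores (not taken from the study).
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 to turn scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("F-measure:", f_measure(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

On this toy data the F-measure is 0.75 and the AUC is 0.9375. The AUC uses the raw scores and so does not depend on the threshold, whereas the F-measure does. This is one possible reason the two metrics can diverge between internal and external validation, although the abstract does not state the cause.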

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e1a0/8359992/3cf3301617cf/CPT-110-392-g002.jpg
