The class imbalance problem detecting adverse drug reactions in electronic health records.

Author information

IXA Group, University of the Basque Country (UPV-EHU), Spain.

Publication information

Health Informatics J. 2019 Dec;25(4):1768-1778. doi: 10.1177/1460458218799470. Epub 2018 Sep 19.

Abstract

This work addresses adverse drug reaction extraction, with a focus on the class imbalance problem. Adverse drug reactions are infrequent events in electronic health records; nevertheless, documenting them is compulsory. Text mining techniques can help retrieve this kind of valuable information from text. The class imbalance was tackled with different sampling methods, cost-sensitive learning, ensemble learning and one-class classification, using Random Forest as the classifier. The adverse drug reaction extraction model was inferred from a dataset of real electronic health records with an imbalance ratio of 1:222; that is, for each drug-disease pair that is an adverse drug reaction, there are approximately 222 pairs that are not. Applying a sampling technique before cost-sensitive learning gave the best result. On the test set, the f-measure was 0.121 for the minority class and 0.996 for the majority class.
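
The best-performing combination in the abstract, sampling followed by cost-sensitive learning, can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the abstract does not say which sampling method or cost scheme was used, so random undersampling, the target ratio of 20, the "balanced" weight heuristic, and all function names below are assumptions chosen for illustration. The toy labels only reproduce the reported 1:222 ratio; they are not data from the study.

```python
import random
from collections import Counter


def undersample(items, labels, target_ratio, seed=0):
    """Randomly drop majority examples (label 0) so that at most
    target_ratio negatives remain per positive (label 1)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), target_ratio * len(pos)))
    keep = sorted(pos + keep_neg)
    return [items[i] for i in keep], [labels[i] for i in keep]


def class_weights(labels):
    """Assumed cost scheme: n_samples / (n_classes * class_count),
    so errors on the rarer class are penalised more heavily."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def f_measure(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels at the 1:222 ratio: 10 positive pairs, 2220 negative pairs.
labels = [1] * 10 + [0] * 2220
items = list(range(len(labels)))  # stand-ins for drug-disease pairs

# Step 1: sampling reduces the imbalance (hypothetical target of 1:20).
sampled_items, sampled_labels = undersample(items, labels, target_ratio=20)

# Step 2: the remaining imbalance is handled through per-class costs.
weights = class_weights(sampled_labels)
```

The resulting weights could then be handed to a cost-sensitive classifier; for example, scikit-learn's `RandomForestClassifier` accepts such a dictionary through its `class_weight` parameter. Reporting `f_measure` separately for each class, as the abstract does, matters here because a high majority-class score can coexist with a low minority-class score.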
