Automated categorisation of clinical incident reports using statistical text classification.

Author information

Ong Mei-Sing, Magrabi Farah, Coiera Enrico

Affiliation

Centre for Health Informatics, University of New South Wales, Sydney 2052, Australia.

Publication information

Qual Saf Health Care. 2010 Dec;19(6):e55. doi: 10.1136/qshc.2009.036657. Epub 2010 Aug 19.

Abstract

OBJECTIVES

To explore the feasibility of using statistical text classification techniques to automatically categorise clinical incident reports.

METHODS

Statistical text classifiers based on Naïve Bayes and Support Vector Machine algorithms were trained and tested on incident reports submitted by public hospitals to identify two classes of clinical incidents: inadequate clinical handover and incorrect patient identification. Each classifier was trained on 600 reports (300 positives, 300 negatives), and tested on 372 reports (248 positives, 124 negatives). The results were evaluated using standard measures: accuracy, precision, recall, F-measure and area under the receiver operating characteristic (ROC) curve (AUC). Classifier learning rates were also evaluated by measuring classifier accuracy against training set size.
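To make the method concrete, the following is a minimal sketch of a bag-of-words Naïve Bayes text classifier together with the accuracy, precision, recall and F-measure calculations. It is not the authors' implementation: the abstract does not describe the feature representation or software used, so the whitespace tokeniser, the add-one smoothing, the function names and the toy reports below are all assumptions made for illustration. The Support Vector Machine classifiers and the AUC calculation are not covered here.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: simple lowercase whitespace tokenisation (bag of words).
    return text.lower().split()

def train_naive_bayes(docs, labels):
    """Collect the counts needed for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for c in class_counts:
        score = math.log(class_counts[c] / total_docs)
        denom = sum(word_counts[c].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label

def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and balanced F-measure for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f_measure": f}

# Invented toy reports (1 = handover incident, 0 = other); not study data.
train_docs = [
    "patient transferred without handover of medication chart",
    "no handover given at shift change",
    "handover incomplete allergy not communicated",
    "patient fell in bathroom",
    "wrong dose of medication given",
    "equipment failure in theatre",
]
train_labels = [1, 1, 1, 0, 0, 0]
model = train_naive_bayes(train_docs, train_labels)

test_docs = ["incomplete handover at shift change", "patient fell from bed"]
test_labels = [1, 0]
predictions = [predict(model, d) for d in test_docs]
print(evaluate(test_labels, predictions))
```

With real data, the same `evaluate` function would be applied to predictions on the held-out test reports, and repeating training on progressively larger subsets of the training set would give the accuracy-versus-size curve mentioned above.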

RESULTS

All classifiers performed well in categorising clinical handover and patient identification incidents. Naïve Bayes attained the best performance on handover incidents, correctly identifying 86.29% of reporter-classified incidents (precision = 0.84, recall = 0.90, F-measure = 0.87, AUC = 0.93) and 91.53% of expert-classified incidents (precision = 0.87, recall = 0.98, F-measure = 0.92, AUC = 0.97). For patient identification incidents, the best results on reporter-classified reports were obtained with a Support Vector Machine using a radial-basis function kernel (accuracy = 97.98%, precision = 0.98, recall = 0.98, F-measure = 0.98, AUC = 1.00), and the best results on expert-classified reports were obtained with Naïve Bayes (accuracy = 95.97%, precision = 0.95, recall = 0.98, F-measure = 0.96, AUC = 0.99). A relatively small training set was found to be adequate, with most classifiers achieving an accuracy above 80% when the training set size was as small as 100 samples.
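As a quick consistency check on the figures above, the F-measure can be recomputed from the quoted precision and recall. The sketch below assumes the reported F-measure is the balanced form (the harmonic mean of precision and recall, often called F1), which the abstract does not state explicitly; the function name and dictionary keys are illustrative only.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F-measure) as quoted in the abstract.
reported = {
    "handover, reporter-classified, Naive Bayes": (0.84, 0.90, 0.87),
    "handover, expert-classified, Naive Bayes": (0.87, 0.98, 0.92),
    "patient ID, reporter-classified, SVM (RBF)": (0.98, 0.98, 0.98),
    "patient ID, expert-classified, Naive Bayes": (0.95, 0.98, 0.96),
}

for name, (p, r, f_reported) in reported.items():
    f = f_measure(p, r)
    print(f"{name}: computed F = {f:.2f}, reported F = {f_reported:.2f}")
```

Under that assumption, all four recomputed values agree with the reported F-measures to two decimal places.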

CONCLUSIONS

This study demonstrates the feasibility of using text classification techniques to automatically categorise clinical incident reports.
