Rabhi Sara, Jakubowicz Jérémie, Metzger Marie-Helene
Telecom SudParis, Institut Mines-Telecom, Paris, Île-de-France, France.
INSERM U1018, Villejuif, France.
Methods Inf Med. 2019 Jun;58(1):31-41. doi: 10.1055/s-0039-1677692. Epub 2019 Mar 15.
The objective of this article was to compare the performance of deep learning and conventional machine learning (ML) methods in detecting health care-associated infections (HAIs) in French medical reports.
The corpus consisted of different types of medical reports (discharge summaries, surgery reports, consultation reports, etc.). A total of 1,531 medical text documents were extracted from three French university hospitals and deidentified. Each document was labeled as presence (1) or absence (0) of HAI. We started by normalizing the records using a list of preprocessing techniques. We calculated an overall performance metric, the F1 score, to compare a deep learning method (convolutional neural network [CNN]) with the most popular conventional ML models (Bernoulli and multinomial naïve Bayes, k-nearest neighbors, logistic regression, random forests, extra-trees, gradient boosting, support vector machines). We applied Bayesian hyperparameter optimization to each model, based on its HAI identification performance. We treated the choice of text representation as an additional hyperparameter for each model, using four different text representations (bag of words, term frequency-inverse document frequency, word2vec, and GloVe).
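As an illustration of the comparison metric, the following is a minimal sketch of how an F1 score can be computed from binary HAI labels (1 = presence, 0 = absence). The function name `f1_score_binary` and the toy label lists are assumptions made for this example; they are not taken from the study.

```python
def f1_score_binary(y_true, y_pred):
    """Return the F1 score for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp == 0:
        # No correctly detected positive: precision or recall is zero.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight documents (illustrative only).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_f1 = f1_score_binary(toy_true, toy_pred)
print(f"F1 = {toy_f1:.3f}")
```

Because F1 combines precision and recall, it penalizes both missed infections and false notifications, which makes it a reasonable single number for ranking models on a two-class task like this one.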
The CNN outperformed all conventional ML algorithms for HAI classification. The best F1 score (97.7% ± 3.6%) and the best area under the curve (99.8% ± 0.41%) were achieved when the CNN was applied directly to the processed clinical notes, without a pretrained word2vec embedding. Using receiver operating characteristic curve analysis with Youden's index as the reference, we achieved a good balance between false notifications (specificity of 0.937) and system detection capability (sensitivity of 0.962).
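Youden's index is defined as J = sensitivity + specificity - 1, and the operating point on the receiver operating characteristic curve is commonly chosen as the threshold that maximizes J. The sketch below illustrates that selection in plain Python. The helper names (`youden_j`, `youden_threshold`) and the toy scores are assumptions for this example and do not reproduce the study's data or code.

```python
def youden_j(sensitivity, specificity):
    """Youden's index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def youden_threshold(labels, scores):
    """Return (threshold, sensitivity, specificity, J) maximizing Youden's J.

    A document is predicted positive when its score is >= threshold.
    On ties, the lowest such threshold is kept.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best = None
    # Every distinct score is a candidate decision threshold.
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        sens = tp / positives
        spec = tn / negatives
        j = youden_j(sens, spec)
        if best is None or j > best[3]:
            best = (t, sens, spec, j)
    return best


# Hypothetical classifier scores for eight documents (illustrative only).
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
best_point = youden_threshold(toy_labels, toy_scores)
print(best_point)

# J at the operating point reported above (sensitivity 0.962, specificity 0.937).
reported_j = youden_j(0.962, 0.937)
print(f"J = {reported_j:.3f}")
```

For the reported operating point, this gives J = 0.962 + 0.937 - 1 = 0.899.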
The main drawback of CNNs is their opacity. To address this issue, we examined the activation values of the CNN's inner layers to visualize the most meaningful phrases in a document. This method could be used to build a phrase-based medical assistant algorithm that helps infection control practitioners select relevant medical records. Our study demonstrated that a deep learning approach outperforms other classification algorithms for automatically identifying HAIs in medical reports.