Division of Medical Information Sciences, University Hospitals of Geneva.
Department of Radiology and Medical Informatics, University of Geneva, Switzerland.
Stud Health Technol Inform. 2022 May 25;294:874-875. doi: 10.3233/SHTI220613.
Care professionals read many medical narratives in their preferred language. These documents may be produced by organizations, authorities, or national publishers, but they are often hard to find with the usual English-based search engines such as PubMed. This work explores the feasibility of automatically categorizing French-language medical documents with a Natural Language Processing pipeline. The pipeline is used to compare the performance of 6 machine learning and deep neural network approaches on a large dataset drawn from a peer-reviewed Swiss medical journal published weekly in French, covering major topics in medicine over the last 15 years. An accuracy of 96% was achieved for 5-topic classification and 81% for 20-topic classification.
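To make the task concrete, the following is a minimal sketch of topic classification and accuracy scoring for French text. It is not the pipeline or any of the 6 approaches evaluated in this work, which the abstract does not detail. It assumes a simple multinomial Naive Bayes classifier over bag-of-words counts, and the class name, helper functions, topic labels, and toy sentences are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters, including French accents."""
    return re.findall(r"[a-zàâäçéèêëîïôöùûüÿœ]+", text.lower())


class NaiveBayesTopicClassifier:
    """Hypothetical multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # topic -> word frequencies
        self.doc_counts = Counter()              # topic -> number of documents
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior of the topic plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            for token in tokens:
                score += math.log(
                    (counts[token] + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of documents whose predicted topic matches the true topic."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Toy data invented for illustration; not taken from the journal corpus.
train_texts = [
    "infarctus du myocarde et insuffisance cardiaque",
    "hypertension artérielle et risque cardiaque",
    "dépression sévère et troubles anxieux",
    "psychothérapie des troubles anxieux",
]
train_labels = ["cardiologie", "cardiologie", "psychiatrie", "psychiatrie"]
test_texts = ["insuffisance cardiaque chronique", "traitement des troubles anxieux"]
test_labels = ["cardiologie", "psychiatrie"]

model = NaiveBayesTopicClassifier().fit(train_texts, train_labels)
print(accuracy(model, test_texts, test_labels))
```

Accuracy here is the share of test documents assigned their correct topic, the same metric reported for the 5-topic and 20-topic settings. A real evaluation would need a far larger held-out set than this toy example.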