Dudchenko Aleksei, Dudchenko Polina, Ganzinger Matthias, Kopanitsa Georgy
National Research Tomsk Polytechnic University, Tomsk, Russia.
Institute of Medical Biometry and Informatics, Heidelberg University, Heidelberg, Germany.
Stud Health Technol Inform. 2019;261:62-67.
Despite the adoption of electronic medical records, free narrative text is still widely used in medical documentation. Such text cannot be analyzed by statistical tools or processed by decision support systems. To make data from these texts available for such tasks, supervised machine learning algorithms might be successfully applied. In this work, we develop and compare prototypes of a medical data extraction system based on different artificial neural network architectures to process free medical texts in the Russian language. The best F-score (0.9763) was achieved by a combination of a CNN prediction model and a large pre-trained word2vec model. A very close result (0.9741) was shown by the MLP model with the same embedding.
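The abstract compares models by F-score but does not say how the score is defined. As a rough illustration only, the sketch below assumes the balanced F1 variant (beta = 1), computed from true positive, false positive, and false negative counts. The function name `f_score` and the example counts are hypothetical and are not taken from the paper.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from raw counts.

    Assumption: the paper's "F-score" is the balanced F1 (beta = 1),
    i.e. the harmonic mean of precision and recall.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
print(round(f_score(tp=90, fp=5, fn=5), 4))
```

With beta = 1 the formula reduces to 2·TP / (2·TP + FP + FN), so a score near 0.976 means that false positives and false negatives together are small relative to the correctly extracted items.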