Institute of Medical Informatics, University of Münster, Münster, Germany.
Department of Computer Science, University of Münster, Münster, Germany.
Stud Health Technol Inform. 2024 Aug 22;316:1694-1698. doi: 10.3233/SHTI240749.
In many healthcare facilities, drugs are prescribed in only a semi-structured manner, using free-text fields in which different kinds of information are often mixed. Automatic processing, especially for secondary uses such as research, is therefore often challenging. This paper compares several approaches to identifying and classifying the parts of these German-language free-text fields: a simple Levenshtein-based approach, a rule-based approach, and a CRF (conditional random field)-based approach. Our results show that an F1-score above 90% can be achieved with both the rule-based and the CRF-based approach, with the CRF-based approach reaching nearly 95%.
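To illustrate what a simple Levenshtein-based approach to this task could look like, the following is a minimal sketch and not the authors' implementation. It assigns each whitespace-separated token of a prescription string the label of the closest lexicon entry by normalized edit distance. The lexicon, the label names (DRUG, UNIT, FORM, NUMBER, OTHER), the threshold of 0.25, and the function names are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical mini-lexicon; a real system would need far larger word lists.
LEXICON = {
    "DRUG": ["ibuprofen", "metoprolol", "pantoprazol"],
    "UNIT": ["mg", "ml"],
    "FORM": ["tablette", "tropfen"],
}


def classify_token(token: str, max_relative_distance: float = 0.25) -> str:
    """Label a token by its nearest lexicon entry, or OTHER if none is close."""
    word = token.lower()
    # Plain numbers, including German decimal commas such as "2,5".
    if word.replace(",", "").replace(".", "").isdigit():
        return "NUMBER"
    best_label, best_score = "OTHER", max_relative_distance
    for label, entries in LEXICON.items():
        for entry in entries:
            # Normalize by the longer string so short words are not favored.
            score = levenshtein(word, entry) / max(len(word), len(entry))
            if score <= best_score:
                best_label, best_score = label, score
    return best_label


def classify_text(text: str) -> list[tuple[str, str]]:
    """Classify every whitespace-separated token of a free-text field."""
    return [(token, classify_token(token)) for token in text.split()]


if __name__ == "__main__":
    # Misspellings ("Ibuprofn", "Tablete") still match their lexicon entries.
    print(classify_text("Ibuprofn 400 mg Tablete 1-0-1"))
```

In this sketch, the dosing scheme "1-0-1" falls through to OTHER because nothing in the lexicon is close to it. Gaps of this kind are where hand-written patterns or a sequence model that uses neighboring tokens as context could plausibly help.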