Viani Natalia, Chiudinelli Lorenzo, Tasca Cristina, Zambelli Alberto, Bucalo Mauro, Ghirardi Arianna, Barbarini Nicola, Sfreddo Eleonora, Sacchi Lucia, Tondini Carlo, Bellazzi Riccardo
Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy.
ASST Papa Giovanni XXIII Hospital, Bergamo, Italy.
Stud Health Technol Inform. 2018;247:715-719.
Medical reports often contain a large amount of relevant information as free text. Reusing these unstructured texts for biomedical research requires extracting structured data from them. In this work, we adapted a previously developed information extraction system to the oncology domain in order to process a set of anatomic pathology reports written in Italian. The system relies on a domain ontology, which was adapted and refined iteratively. The final output was evaluated by a domain expert, with promising results.
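As a rough illustration of what ontology-driven extraction from free text can look like, the following minimal sketch maps a few concepts to Italian surface patterns and pulls their values out of a report. It is not the system described above: the concept names, the regular expressions, the `extract` function, and the sample report text are all hypothetical and chosen only for illustration, and a plain dictionary stands in for a real domain ontology.

```python
import re

# Hypothetical mini-ontology: concept name -> regex with one capture group.
# Concepts and Italian patterns are illustrative assumptions,
# not the ontology used by the authors.
ONTOLOGY = {
    "histological_grade": r"grado(?:\s+istologico)?\s*:?\s*(G[1-3])",
    "ki67_percent": r"ki-?67\s*:?\s*(\d{1,3})\s*%",
    "histotype": r"(carcinoma\s+(?:duttale|lobulare)\s+infiltrante)",
}


def extract(report: str) -> dict:
    """Return the first matched value for each concept found in the report."""
    found = {}
    for concept, pattern in ONTOLOGY.items():
        match = re.search(pattern, report, flags=re.IGNORECASE)
        if match:
            found[concept] = match.group(1)
    return found


# Invented example sentence, not taken from any real report.
sample = "Carcinoma duttale infiltrante. Grado istologico G2. Ki-67: 20%."
result = extract(sample)
print(result)
```

Refining such a resource iteratively would, under these assumptions, amount to adding or adjusting concepts and patterns whenever an expert review finds missed or wrong values.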