Soni Sarvesh, Gudala Meghana, Wang Daisy Zhe, Roberts Kirk
School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX.
Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL.
AMIA Annu Symp Proc. 2020 Mar 4;2019:1207-1215. eCollection 2019.
This paper describes a novel technique for annotating clinical questions with logical forms and answers by utilizing Fast Healthcare Interoperability Resources (FHIR). Such annotations are widely used to build semantic parsing models, which aim to capture the precise meaning of natural language questions by converting them into machine-understandable logical forms. Systems built on these models aim to reduce the time it takes a user to reach information stored in electronic health records (EHRs). Directly annotating questions with logical forms is challenging and involves a time-consuming concept normalization annotation step. We aim to automate this step using the normalized codes already present in a FHIR resource. Using the proposed approach, two annotators curated a dataset of 1000 annotated questions in less than 1 week. To assess the quality of these annotations, we trained a semantic parsing model, which achieved an accuracy of 94.2% on this corpus.
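As a rough illustration of why a FHIR resource can stand in for manual concept normalization, the sketch below reads the coded concepts that a resource already carries in its `code` element. The sample Condition resource, the function names `normalized_codes` and `annotate`, and the layout of the returned annotation are all assumptions made for this example; they are not the annotation tool or the logical-form syntax used in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical, hand-written FHIR Condition resource (not taken from the paper's data).
condition = {
    "resourceType": "Condition",
    "code": {
        "coding": [
            {
                "system": "http://snomed.info/sct",
                "code": "44054006",
                "display": "Diabetes mellitus type 2",
            }
        ],
        "text": "Type 2 diabetes",
    },
}


def normalized_codes(resource: Dict) -> List[Tuple[str, str, str]]:
    """Return (system, code, display) triples from the resource's `code` element."""
    codings = resource.get("code", {}).get("coding", [])
    return [
        (c.get("system", ""), c.get("code", ""), c.get("display", ""))
        for c in codings
    ]


def annotate(question: str, resource: Dict) -> Dict:
    """Pair a question with the codes already present in the resource.

    The returned layout is illustrative only; it is not the paper's
    logical-form syntax.
    """
    return {
        "question": question,
        "resource_type": resource.get("resourceType"),
        "concepts": normalized_codes(resource),
    }


annotation = annotate("Does the patient have diabetes?", condition)
print(annotation["concepts"])
```

Because the code system and code come directly from the resource, the annotator in this sketch never has to look up a concept identifier by hand, which is the step the abstract describes as time-consuming.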