Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, United States.
School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States.
J Med Internet Res. 2024 Oct 30;26:e53636. doi: 10.2196/53636.
Question answering (QA) systems for patient-related data can assist both clinicians and patients. They can, for example, assist clinicians in decision-making and enable patients to have a better understanding of their medical history. Substantial amounts of patient data are stored in electronic health records (EHRs), making EHR QA an important research area. Because EHR data differ in format and modality from the medical websites and scientific papers that other medical QA tasks draw on to retrieve answers, EHR QA differs greatly from those tasks and warrants dedicated research.
This study aims to provide a methodological review of existing works on QA for EHRs. The objectives were to identify and analyze the existing EHR QA datasets, study the state-of-the-art methodologies used for this task, compare the evaluation metrics used by these state-of-the-art models, and elicit the challenges and open issues in EHR QA.
We searched 4 digital sources (Google Scholar, ACL Anthology, ACM Digital Library, and PubMed) for articles published from January 1, 2005, to September 30, 2023, to collect relevant publications on EHR QA. Our systematic screening process followed PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A total of 4111 papers were identified, and after screening based on our inclusion criteria, we obtained 47 papers for further study. The selected studies were then classified into 2 non-mutually exclusive categories depending on their scope: "EHR QA datasets" and "EHR QA models."
A systematic screening process obtained 47 papers on EHR QA for final review. Of the 47 papers, 53% (n=25) were about EHR QA datasets and 79% (n=37) were about EHR QA models. QA on EHRs is relatively new and unexplored, and most of the works are fairly recent. In addition, emrQA is by far the most popular EHR QA dataset, both in terms of citations and usage in other papers. We classified the EHR QA datasets based on their modality and found that Medical Information Mart for Intensive Care (MIMIC-III) and the National Natural Language Processing Clinical Challenges datasets (ie, n2c2 datasets) are the most popular EHR databases and corpora used in EHR QA. Furthermore, we identified the different models used in EHR QA along with the evaluation metrics used for these models.
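Because the 2 categories are not mutually exclusive, the reported shares add up to more than 100%. The short sketch below only rechecks that arithmetic from the counts given above. The overlap figure rests on an assumption that the abstract does not state, namely that every included paper belongs to at least one of the 2 categories; the variable and function names are illustrative only.

```python
# Counts as reported in the abstract.
total_papers = 47
dataset_papers = 25
model_papers = 37


def share(count, total):
    """Return the percentage of `count` in `total`, rounded to a whole number."""
    return round(100 * count / total)


dataset_share = share(dataset_papers, total_papers)
model_share = share(model_papers, total_papers)

# Assumption (not stated in the source): every included paper falls into
# at least one of the two categories, so inclusion-exclusion gives the
# number of papers counted in both.
overlap = dataset_papers + model_papers - total_papers

print(f"datasets: {dataset_share}%, models: {model_share}%, both: {overlap}")
```

Under that assumption, the papers counted in both categories would be those that introduce a dataset and also report a model on it.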
EHR QA research faces multiple challenges, such as the limited availability of clinical annotations, concept normalization in EHR QA, and the difficulty of generating realistic EHR QA datasets. There are still many gaps in research that motivate further work. This study will assist future researchers in focusing on areas of EHR QA that offer directions for future research.