Developing a Semantically Based Query Recommendation for an Electronic Medical Record Search Engine: Query Log Analysis and Design Implications.

Author information

Wu Danny T Y, Hanauer David, Murdock Paul, Vydiswaran V G Vinod, Mei Qiaozhu, Zheng Kai

Affiliations

Department of Biomedical Informatics, University of Cincinnati College of Medicine, Cincinnati, OH, United States.

School of Information, University of Michigan, Ann Arbor, MI, United States.

Publication information

JMIR Form Res. 2023 Sep 15;7:e45376. doi: 10.2196/45376.

Abstract

BACKGROUND

An effective and scalable information retrieval (IR) system plays a crucial role in enabling clinicians and researchers to harness the valuable information present in electronic health records. In a previous study, we developed a prototype medical IR system, which incorporated a semantically based query recommendation (SBQR) feature. The system was evaluated empirically and demonstrated high perceived performance by end users. To delve deeper into the factors contributing to this perceived performance, we conducted a follow-up study using query log analysis.

OBJECTIVE

One of the primary challenges faced in IR is that users often have limited knowledge regarding their specific information needs. Consequently, an IR system, particularly its user interface, needs to be thoughtfully designed to assist users through the iterative process of refining their queries as they encounter relevant documents during their search. To address these challenges, we incorporated "query recommendation" into our Electronic Medical Record Search Engine (EMERSE), drawing inspiration from the success of similar features in modern IR systems for general purposes.

METHODS

The query log data analyzed in this study were collected during our previous experimental study, in which we developed EMERSE with the SBQR feature. We implemented a logging mechanism to capture user query behaviors and the output of the IR system (retrieved documents). In this analysis, we compared the initial query entered by users with the query formulated with the assistance of the SBQR. This comparison allowed us to assess whether the use of SBQR helped users construct improved queries that differed from the original ones.
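As an illustration of how an initial query and an SBQR-assisted query might be compared, the sketch below computes the Jaccard similarity coefficient between the term sets of two queries. The abstract does not describe how queries were tokenized or which units were compared, so the whitespace tokenizer, the function names (`tokenize`, `jaccard`), and the sample queries are all hypothetical and are not the authors' implementation.

```python
def tokenize(query):
    """Lowercase a query and split it on whitespace (an assumed, simplistic tokenizer)."""
    return query.lower().split()


def jaccard(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of terms."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical queries, invented for illustration only.
first_query = "chest pain troponin"
final_query = "chest pain troponin myocardial infarction"

score = jaccard(tokenize(first_query), tokenize(final_query))
print(score)  # 3 shared terms out of 5 distinct terms -> 0.6
```

A coefficient near 1 means the two queries share most of their terms; a coefficient near 0 means the user substantially rewrote the query.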

RESULTS

Our findings revealed that the first query entered without SBQR and the final query with SBQR assistance were highly similar (Jaccard similarity coefficient=0.77). This suggests that the perceived positive performance of the system was primarily attributable to the automatic query expansion facilitated by the SBQR rather than to users manually manipulating their queries. In addition, through entropy analysis, we observed that search results converged in scenarios of moderate difficulty, and the degree of convergence correlated strongly with the perceived system performance.
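The abstract does not state how entropy was computed. One plausible reading, offered here only as a sketch, is Shannon entropy over the documents that different participants retrieved for the same scenario: lower entropy would indicate that results converged on the same documents. The function name `shannon_entropy` and the document identifiers below are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no observations: treat as zero uncertainty
    # H = sum over outcomes of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical retrieved-document IDs across four participants.
converged = ["doc1", "doc1", "doc1", "doc1"]  # everyone reached the same document
scattered = ["doc1", "doc2", "doc3", "doc4"]  # everyone reached a different document

print(shannon_entropy(converged))  # 0.0
print(shannon_entropy(scattered))  # 2.0
```

Under this assumed definition, a scenario in which participants' results converge yields entropy near zero, while fully divergent results yield the maximum, log2 of the number of distinct documents.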

CONCLUSIONS

The study demonstrated the potential contribution of the SBQR in shaping participants' positive perceptions of system performance, contingent upon the difficulty of the search scenario. Medical IR systems should therefore consider incorporating an SBQR as a user-controlled option or a semiautomated feature. Future work entails redesigning the experiment in a more controlled manner and conducting multisite studies to demonstrate the effectiveness of EMERSE with SBQR for patient cohort identification. By further exploring and validating these findings, we can enhance the usability and functionality of medical IR systems in real-world settings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9d2/10541636/c88c4aaaca94/formative_v7i1e45376_fig1.jpg
