Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.

Authors

De-Arteaga Maria, Eggel Ivan, Do Bao, Rubin Daniel, Kahn Charles E, Müller Henning

Affiliations

Carnegie Mellon University, Pittsburgh, PA, USA.

University of Applied Sciences Western Switzerland (HES-SO), Sierre, Switzerland.

Publication information

J Biomed Inform. 2015 Aug;56:57-64. doi: 10.1016/j.jbi.2015.04.013. Epub 2015 May 19.

Abstract

Information search has changed the way we manage knowledge, and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or, increasingly, via mobile devices. Medical information search is no different in this respect, and much research has been devoted to analysing how physicians seek information. Medical image search is a much smaller domain but has gained considerable attention because its characteristics differ from those of text document search. While web search log files have been analysed many times to better understand user behaviour, the log files of hospital-internal systems for search in a PACS/RIS (Picture Archiving and Communication System, Radiology Information System) have rarely been analysed. The goal of this paper is to compare a hospital PACS/RIS search with a web system for searching images from the biomedical literature. The objectives are to identify similarities and differences in search behaviour between the two systems, which could then be used to optimize existing systems and build new search engines. Log files of the ARRS GoldMiner medical image search engine (freely accessible on the Internet), containing 222,005 queries, and log files of Stanford's internal PACS/RIS search, called radTF, containing 18,068 queries, were analysed. Each query was preprocessed and all query terms were mapped to RadLex (Radiology Lexicon), a comprehensive lexicon of radiology terms created and maintained by the Radiological Society of North America, so that the semantic content of the queries and the links between terms could be analysed and synonyms for the same concept could be detected. RadLex was created mainly for use in radiology reports, to aid structured reporting and the preparation of educational material (Langlotz, 2006) [1].
In standard medical vocabularies such as MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System), radiology-specific terms are often underrepresented; RadLex was therefore considered the best option for this task. The results show a surprising similarity between usage behaviour in the two systems, although several subtle differences can also be noted. The average number of terms per query is 2.21 for GoldMiner and 2.07 for radTF. The RadLex axes used (anatomy, pathology, findings, …) have almost the same distribution, with clinical finding the most frequent and anatomical entity the second; combinations of RadLex axes are also extremely similar between the two systems. Differences include longer sessions in radTF than in GoldMiner (3.4 and 1.9 queries per session on average, respectively). Several frequent search terms overlap, but some strong differences exist in the details: in radTF the term "normal" is frequent, whereas in GoldMiner it is not. This makes intuitive sense, as normal cases are rarely described in the literature, whereas in clinical work comparison with normal cases is often a first step. The general similarity in many respects is likely because users of both systems are influenced by their daily use of standard web search engines and carry this behaviour over into their professional search. This means that many results and insights gained from standard web search can likely be transferred to more specialized search systems. Still, specialized log files can be used to learn more about users' reformulations and detailed strategies for finding the right content.
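
The per-query and per-session statistics reported above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the log layout, the 30-minute inactivity cutoff used to split sessions, and the toy term-to-axis table (which does not contain real RadLex entries or identifiers) are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical log records: (user_id, ISO timestamp, raw query).
LOG = [
    ("u1", "2013-01-01T09:00:00", "Normal chest CT"),
    ("u1", "2013-01-01T09:02:00", "pneumothorax"),
    ("u1", "2013-01-01T11:00:00", "liver  cyst"),
    ("u2", "2013-01-01T09:05:00", "fracture femur"),
]

# Toy stand-in for a lexicon lookup; NOT real RadLex content.
TOY_AXES = {
    "chest": "anatomical entity",
    "liver": "anatomical entity",
    "femur": "anatomical entity",
    "pneumothorax": "clinical finding",
    "cyst": "clinical finding",
    "fracture": "clinical finding",
}

# Assumed inactivity cutoff; the abstract does not say how sessions were split.
SESSION_GAP = timedelta(minutes=30)


def tokenize(query):
    """Preprocess a query: lowercase it and split on whitespace."""
    return query.lower().split()


def mean_terms_per_query(log):
    """Average number of whitespace-separated terms per query."""
    counts = [len(tokenize(query)) for _, _, query in log]
    return sum(counts) / len(counts)


def count_sessions(log):
    """Count sessions; a user's session ends after SESSION_GAP of inactivity."""
    last_seen = {}
    sessions = 0
    # ISO timestamps sort correctly as strings, so sort by (user, timestamp).
    for user, stamp, _ in sorted(log, key=lambda rec: (rec[0], rec[1])):
        when = datetime.fromisoformat(stamp)
        if user not in last_seen or when - last_seen[user] > SESSION_GAP:
            sessions += 1
        last_seen[user] = when
    return sessions


def axis_counts(log):
    """Count how often each (toy) axis is hit across all query terms."""
    counts = Counter()
    for _, _, query in log:
        for term in tokenize(query):
            counts[TOY_AXES.get(term, "unmapped")] += 1
    return counts


if __name__ == "__main__":
    print("terms per query:", mean_terms_per_query(LOG))
    print("queries per session:", len(LOG) / count_sessions(LOG))
    print("axis distribution:", dict(axis_counts(LOG)))
```

Running the same functions over two separate logs would give directly comparable figures of the kind quoted in the abstract (terms per query, queries per session, axis distribution). A real analysis would also need multi-word term matching and synonym resolution against the lexicon, which this sketch leaves out.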
