Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, Bethesda, MD, USA.
AMIA Annu Symp Proc. 2024 Jan 11;2023:369-378. eCollection 2023.
Searching for information is now an integral part of healthcare. Search engines enable this by efficiently retrieving the information relevant to a user's query. For retrieving biomedical text and literature, the Essie search engine developed at the National Library of Medicine (NLM) performs exceptionally well. However, Essie is a software system built for NLM, and its development and support have ceased. Solr, by contrast, is a popular open-source enterprise search engine used by many of the world's largest internet sites; it is under continuous development and offers state-of-the-art features. In this paper, we present our approach to porting the key features of Essie to Solr and developing custom Solr components. We demonstrate the effectiveness of the added components on three benchmark biomedical datasets. The custom components may help the community improve search methods for biomedical text retrieval.