Effects of Porting Essie Tokenization and Normalization to Solr.

Affiliation

Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, Bethesda, MD, USA.

Publication information

AMIA Annu Symp Proc. 2024 Jan 11;2023:369-378. eCollection 2023.

Abstract

Searching for information is now an integral part of healthcare. Searches are enabled by search engines whose objective is to efficiently retrieve the information relevant to the user's query. For retrieving biomedical text and literature, the Essie search engine developed at the National Library of Medicine (NLM) performs exceptionally well. However, Essie was developed specifically for NLM, and its development and support have ceased. Solr, by contrast, is a popular open-source enterprise search engine used by many of the world's largest internet sites, and it benefits from continuous development and improvement along with state-of-the-art features. In this paper, we present our approach to porting the key features of Essie and developing custom components for use in Solr. We demonstrate the effectiveness of the added components on three benchmark biomedical datasets. The custom components may aid the community in improving search methods for biomedical text retrieval.

Similar articles

1
Effects of Porting Essie Tokenization and Normalization to Solr.
AMIA Annu Symp Proc. 2024 Jan 11;2023:369-378. eCollection 2023.
2
Essie: a concept-based search engine for structured biomedical text.
J Am Med Inform Assoc. 2007 May-Jun;14(3):253-63. doi: 10.1197/jamia.M2233. Epub 2007 Feb 28.
3
The NLM Gateway: a metasearch engine for disparate resources.
Stud Health Technol Inform. 2004;107(Pt 1):52-6.
4
Searching for cancer information on the internet: analyzing natural language search queries.
J Med Internet Res. 2003 Dec 11;5(4):e31. doi: 10.2196/jmir.5.4.e31.
5
Evaluation of Clinical Text Segmentation to Facilitate Cohort Retrieval.
AMIA Annu Symp Proc. 2018 Apr 16;2017:660-669. eCollection 2017.
6
G-Bean: an ontology-graph based web tool for biomedical literature retrieval.
BMC Bioinformatics. 2014;15 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-15-S12-S1. Epub 2014 Nov 6.
7
A2A: a platform for research in biomedical literature search.
BMC Bioinformatics. 2020 Dec 21;21(Suppl 19):572. doi: 10.1186/s12859-020-03894-8.
8
BIOMedical Search Engine Framework: Lightweight and customized implementation of domain-specific biomedical search engines.
Comput Methods Programs Biomed. 2016 Jul;131:63-77. doi: 10.1016/j.cmpb.2016.03.030. Epub 2016 Apr 8.
