Krithara Anastasia, Mork James G, Nentidis Anastasios, Paliouras Georgios
Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", Athens, Greece.
National Library of Medicine, Bethesda, MD, United States.
Front Res Metr Anal. 2023 Sep 29;8:1250930. doi: 10.3389/frma.2023.1250930. eCollection 2023.
Biomedical experts struggle to keep up with the vast amount of biomedical knowledge published daily. With millions of citations added to databases such as MEDLINE/PubMed each year, efficient access to relevant information is crucial. Traditional term-based searches can return irrelevant documents or miss relevant ones because of homonyms, synonyms, abbreviations, or term mismatch. To address this, semantic search approaches that employ predefined concepts with associated synonyms and relations have been used to expand query terms and improve information retrieval. The National Library of Medicine (NLM) plays a significant role in this area: it indexes citations in the MEDLINE database with topic descriptors from the Medical Subject Headings (MeSH) thesaurus, enabling advanced semantic search strategies that retrieve relevant citations despite the synonymy and polysemy of biomedical terms. Over time, semantic indexing has advanced, with Machine Learning facilitating the transition from manual to automatic semantic indexing of the biomedical literature. This paper traces that transition, starting with manual semantic indexing and the initial efforts toward automatic indexing. The BioASQ challenge has served as a catalyst in revolutionizing the domain of semantic indexing, further pushing the boundaries of efficient knowledge retrieval in the biomedical field.