The road from manual to automatic semantic indexing of biomedical literature: a 10 years journey.

Authors

Krithara Anastasia, Mork James G, Nentidis Anastasios, Paliouras Georgios

Affiliations

Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", Athens, Greece.

National Library of Medicine, Bethesda, MD, United States.

Publication information

Front Res Metr Anal. 2023 Sep 29;8:1250930. doi: 10.3389/frma.2023.1250930. eCollection 2023.

DOI: 10.3389/frma.2023.1250930
PMID: 37841902
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10576528/

Abstract

Biomedical experts are facing challenges in keeping up with the vast amount of biomedical knowledge published daily. With millions of citations added to databases like MEDLINE/PubMed each year, efficiently accessing relevant information becomes crucial. Traditional term-based searches may lead to irrelevant or missed documents due to homonyms, synonyms, abbreviations, or term mismatch. To address this, semantic search approaches employing predefined concepts with associated synonyms and relations have been used to expand query terms and improve information retrieval. The National Library of Medicine (NLM) plays a significant role in this area, indexing citations in the MEDLINE database with topic descriptors from the Medical Subject Headings (MeSH) thesaurus, enabling advanced semantic search strategies to retrieve relevant citations despite the synonymy and polysemy of biomedical terms. Over time, advancements in semantic indexing have been made, with Machine Learning facilitating the transition from manual to automatic semantic indexing in the biomedical literature. The paper highlights the journey of this transition, starting with manual semantic indexing and the initial efforts toward automatic indexing. The BioASQ challenge has served as a catalyst in revolutionizing the domain of semantic indexing, further pushing the boundaries of efficient knowledge retrieval in the biomedical field.
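
The query expansion mentioned in the abstract can be illustrated with a minimal sketch. It assumes a tiny hand-made thesaurus in which each concept has a preferred term and a few synonyms; the entries, the function names (`expand_query`, `search`), and the sample documents are hypothetical and are not taken from MeSH, from NLM tooling, or from the paper. A query is mapped to its concept, expanded to every term of that concept, and matched against the text on word boundaries.

```python
import re

# Toy thesaurus: preferred term -> synonyms.
# Illustrative entries only; these are NOT real MeSH records.
THESAURUS = {
    "myocardial infarction": ["heart attack"],
    "neoplasms": ["cancer", "tumor", "tumour"],
}


def expand_query(query, thesaurus=THESAURUS):
    """Return all terms of the concept the query belongs to (sorted).

    If the query matches no preferred term or synonym, it is returned
    unchanged as a single-element list.
    """
    key = query.strip().lower()
    for preferred, synonyms in thesaurus.items():
        terms = {preferred, *synonyms}
        if key in terms:
            return sorted(terms)
    return [key]


def search(query, documents, thesaurus=THESAURUS):
    """Return ids of documents containing any expanded term as a whole word."""
    terms = expand_query(query, thesaurus)
    patterns = [re.compile(r"\b" + re.escape(t) + r"\b") for t in terms]
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(p.search(text.lower()) for p in patterns)
    ]


if __name__ == "__main__":
    docs = {
        1: "Outcomes after heart attack in older adults",
        2: "Tumour growth in mice",
        3: "Acute myocardial infarction registry",
    }
    # A plain term search for "heart attack" would miss document 3.
    print(search("heart attack", docs))
    print(search("cancer", docs))
```

Indexing with controlled descriptors goes one step further than this sketch: each citation is labelled with the concept itself, so retrieval does not depend on which synonym the text happens to use.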

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c356/10576528/927932c0e105/frma-08-1250930-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c356/10576528/8ccb8d6cfbef/frma-08-1250930-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c356/10576528/8ad8d6423981/frma-08-1250930-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c356/10576528/3e6cd761515c/frma-08-1250930-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c356/10576528/7e9295a12fb7/frma-08-1250930-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c356/10576528/3feca916121e/frma-08-1250930-g0006.jpg
