Development and evaluation of a biomedical search engine using a predicate-based vector space model.

Affiliation

School of Information Technology, Middle Georgia State College, Macon, GA 31206, United States.

Publication information

J Biomed Inform. 2013 Oct;46(5):929-39. doi: 10.1016/j.jbi.2013.07.006. Epub 2013 Jul 25.

DOI:10.1016/j.jbi.2013.07.006
PMID:23892296
Abstract

Although biomedical information available in articles and patents is increasing exponentially, we continue to rely on the same information retrieval methods and use very few keywords to search millions of documents. We are developing a fundamentally different approach for finding much more precise and complete information with a single query, using predicates instead of keywords for both query and document representation. Predicates are triples: data structures that are more complex than keywords and contain more structured information. To make optimal use of them, we developed a new predicate-based vector space model and a query-document similarity function with adjusted tf-idf and a boost function. Using a test bed of 107,367 PubMed abstracts, we evaluated the first essential function: retrieving information. Cancer researchers provided 20 realistic queries, for which the top 15 abstracts were retrieved using a predicate-based (new) and a keyword-based (baseline) approach. Each abstract was evaluated, double-blind, by cancer researchers on a 0-5 point scale to calculate precision (0 versus higher) and relevance (0-5 score). Precision was significantly higher (p<.001) for the predicate-based (80%) than for the keyword-based (71%) approach. Relevance was almost doubled with the predicate-based approach: 2.1 versus 1.6 without rank order adjustment (p<.001) and 1.34 versus 0.98 with rank order adjustment (p<.001) for the predicate-based versus keyword-based approach, respectively. Predicates can support more precise searching than keywords, laying the foundation for rich and sophisticated information search.
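
To make the idea of a predicate-based vector space model concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify their adjusted tf-idf or boost function, so this sketch assumes a plain smoothed tf-idf weight, no boost, and cosine similarity. The names `tf_idf_vectors`, `cosine`, and `rank`, and the toy triples, are hypothetical. Each vector dimension is a whole (subject, relation, object) triple rather than a single keyword.

```python
import math
from collections import Counter

# Assumption: a predicate is a (subject, relation, object) triple of strings.

def tf_idf_vectors(docs):
    """Build sparse tf-idf vectors whose dimensions are predicates.

    docs: list of documents, each a list of predicate triples.
    Uses a plain smoothed idf (an assumption, not the paper's adjusted tf-idf).
    """
    n = len(docs)
    df = Counter()
    for preds in docs:
        df.update(set(preds))  # document frequency of each predicate
    idf = {p: math.log((1 + n) / (1 + d)) + 1.0 for p, d in df.items()}
    vectors = [
        {p: count * idf[p] for p, count in Counter(preds).items()}
        for preds in docs
    ]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(p, 0.0) for p, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query_preds, docs):
    """Return (score, doc_index) pairs sorted by descending similarity."""
    vectors, idf = tf_idf_vectors(docs)
    # Predicates unseen in the collection get weight 0 and cannot match.
    q_vec = {p: c * idf.get(p, 0.0) for p, c in Counter(query_preds).items()}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))

# Toy collection: documents 0 and 1 share the same keywords,
# but only document 0 contains the queried predicate.
docs = [
    [("tamoxifen", "treats", "breast cancer")],
    [("breast cancer", "treats", "tamoxifen")],
    [("aspirin", "treats", "headache")],
]
query = [("tamoxifen", "treats", "breast cancer")]
ranking = rank(query, docs)
print(ranking)
```

In this toy collection a keyword model would score documents 0 and 1 identically, because they contain the same words. Treating the triple as the unit of matching separates them, which is the intuition behind the precision gain reported above.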

Similar articles

1
Development and evaluation of a biomedical search engine using a predicate-based vector space model.
J Biomed Inform. 2013 Oct;46(5):929-39. doi: 10.1016/j.jbi.2013.07.006. Epub 2013 Jul 25.
2
How Does ChatGPT Use Source Information Compared With Google? A Text Network Analysis of Online Health Information.
Clin Orthop Relat Res. 2024 Apr 1;482(4):578-588. doi: 10.1097/CORR.0000000000002995. Epub 2024 Mar 1.
3
Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles.
BMC Med Inform Decis Mak. 2007 Jan 10;7:1. doi: 10.1186/1472-6947-7-1.
4
Hybrid ontology for semantic information retrieval model using keyword matching indexing system.
ScientificWorldJournal. 2015;2015:414910. doi: 10.1155/2015/414910. Epub 2015 Apr 1.
5
Essie: a concept-based search engine for structured biomedical text.
J Am Med Inform Assoc. 2007 May-Jun;14(3):253-63. doi: 10.1197/jamia.M2233. Epub 2007 Feb 28.
6
The LAILAPS search engine: a feature model for relevance ranking in life science databases.
J Integr Bioinform. 2010 Mar 25;7(3):476. doi: 10.2390/biecoll-jib-2010-118.
7
IntentSearch: Capturing User Intention for One-Click Internet Image Search.
IEEE Trans Pattern Anal Mach Intell. 2012 Jul;34(7):1342-53. doi: 10.1109/TPAMI.2011.242. Epub 2011 Dec 13.
8
Supporting inter-topic entity search for biomedical Linked Data based on heterogeneous relationships.
Comput Biol Med. 2017 Aug 1;87:217-229. doi: 10.1016/j.compbiomed.2017.05.026. Epub 2017 May 31.
9
The LAILAPS search engine: relevance ranking in life science databases.
J Integr Bioinform. 2010 Jan 15;7(2):110. doi: 10.2390/biecoll-jib-2010-110.
10
Log analysis to understand medical professionals' image searching behaviour.
Stud Health Technol Inform. 2012;180:1020-4.

Cited by

1
Skull acoustic aberration correction in photoacoustic microscopy using a vector space similarity model: a proof-of-concept simulation study.
Biomed Opt Express. 2020 Sep 14;11(10):5542-5556. doi: 10.1364/BOE.402027. eCollection 2020 Oct 1.
2
Automated Extraction of Diagnostic Criteria From Electronic Health Records for Autism Spectrum Disorders: Development, Evaluation, and Application.
J Med Internet Res. 2018 Nov 7;20(11):e10497. doi: 10.2196/10497.
3
Effects on Text Simplification: Evaluation of Splitting Up Noun Phrases.
J Health Commun. 2016;21 Suppl 1(Suppl):18-26. doi: 10.1080/10810730.2015.1131775.
4
Sieve-based relation extraction of gene regulatory networks from biological literature.
BMC Bioinformatics. 2015;16 Suppl 16(Suppl 16):S1. doi: 10.1186/1471-2105-16-S16-S1. Epub 2015 Oct 30.
5
A Semantic-based Approach for Exploring Consumer Health Questions Using UMLS.
AMIA Annu Symp Proc. 2014 Nov 14;2014:432-41. eCollection 2014.
6
Towards semantically sensitive text clustering: a feature space modeling technology based on dimension extension.
PLoS One. 2015 Mar 20;10(3):e0117390. doi: 10.1371/journal.pone.0117390. eCollection 2015.