Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols.

Author information

Fabien Campagne

Affiliation

HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY 10021, USA.

Publication information

BMC Bioinformatics. 2008 Feb 29;9:132. doi: 10.1186/1471-2105-9-132.

DOI: 10.1186/1471-2105-9-132
PMID: 18312673
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2292696/

Abstract

BACKGROUND

The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text REtrieval Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications.

RESULTS

We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for high-recall searches, where each query is expected to have many relevant documents. Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Tracks, and observe significant correlations between the performance rankings generated by our approach and by TREC. Spearman's correlation coefficients in the range 0.79-0.92 are observed when comparing bpref measured with NT Evaluation to bpref measured with TREC evaluations. For comparison, coefficients in the range 0.86-0.94 are observed when the same set of methods is evaluated with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and data fusion evaluation protocols introduced recently.
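
The focused-search protocol can be illustrated with a small Python sketch. The abstract only states that the protocols exploit the separation of titles and abstracts in Medline, so the details here are assumptions: titles are withheld from the index, each title is used as a query, and the document the title came from is treated as the single relevant result. The toy corpus, the `overlap_ranker` engine, the function names, and the use of mean reciprocal rank in place of bpref are all hypothetical choices made for illustration.

```python
import re
from typing import Callable

Doc = dict[str, str]  # hypothetical schema: "id", "title", "abstract"
Index = dict[str, set[str]]
Ranker = Callable[[str, Index], list[str]]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_ranker(query: str, index: Index) -> list[str]:
    """Toy engine: rank documents by how many distinct query terms they contain."""
    terms = set(tokenize(query))
    scored = [(len(terms & doc_terms), doc_id) for doc_id, doc_terms in index.items()]
    # Highest score first; ties broken by id so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def nt_focused_mrr(docs: list[Doc], ranker: Ranker) -> float:
    """Mean reciprocal rank of each source document when its title is the query.

    Assumption: only abstracts are indexed (titles are withheld), and the
    document that supplied the title is the only relevant document.
    """
    index = {d["id"]: set(tokenize(d["abstract"])) for d in docs}
    total = 0.0
    for d in docs:
        ranking = ranker(d["title"], index)
        rank = ranking.index(d["id"]) + 1  # 1-based rank of the source document
        total += 1.0 / rank
    return total / len(docs)


# Invented records, not real Medline entries.
DOCS = [
    {"id": "d1", "title": "Kinase signaling in yeast",
     "abstract": "We study kinase signaling pathways in yeast cells."},
    {"id": "d2", "title": "Protein folding dynamics",
     "abstract": "Molecular simulations reveal protein folding dynamics."},
    {"id": "d3", "title": "Gene expression in tumors",
     "abstract": "Tumors show altered gene expression profiles."},
]

if __name__ == "__main__":
    print(f"NT focused MRR: {nt_focused_mrr(DOCS, overlap_ranker):.3f}")
```

Because no human judgment is involved, the same loop can be rerun for any candidate engine passed in as `ranker`, which is what makes the approach suitable for automated comparison.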

CONCLUSION

Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute for human evaluations.
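
The agreement between evaluation protocols reported in the Results is expressed as Spearman's rank correlation over per-method bpref scores. The sketch below shows how such a coefficient could be computed in plain Python. The score lists are invented placeholders, not values from the paper, and the helper names are hypothetical.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical bpref scores for five retrieval methods under each protocol.
bpref_nt = [0.30, 0.25, 0.40, 0.10, 0.35]
bpref_trec = [0.28, 0.30, 0.45, 0.12, 0.33]

if __name__ == "__main__":
    print(f"Spearman rho: {spearman(bpref_nt, bpref_trec):.2f}")
```

A coefficient near 1 means the two protocols order the methods almost identically, even if the absolute bpref values differ.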

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ceb/2292696/b74c2b172733/1471-2105-9-132-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ceb/2292696/a2b5f3560e5f/1471-2105-9-132-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ceb/2292696/ed4880cfbbda/1471-2105-9-132-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ceb/2292696/a52454d4f891/1471-2105-9-132-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ceb/2292696/2fd37db33fa6/1471-2105-9-132-5.jpg

Similar articles

1
Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols.
BMC Bioinformatics. 2008 Feb 29;9:132. doi: 10.1186/1471-2105-9-132.
2
PageRank without hyperlinks: reranking with PubMed related article networks for biomedical text retrieval.
BMC Bioinformatics. 2008 Jun 6;9:270. doi: 10.1186/1471-2105-9-270.
3
Essie: a concept-based search engine for structured biomedical text.
J Am Med Inform Assoc. 2007 May-Jun;14(3):253-63. doi: 10.1197/jamia.M2233. Epub 2007 Feb 28.
4
Evaluating performance of biomedical image retrieval systems--an overview of the medical image retrieval task at ImageCLEF 2004-2013.
Comput Med Imaging Graph. 2015 Jan;39:55-61. doi: 10.1016/j.compmedimag.2014.03.004. Epub 2014 Mar 27.
5
MedScan, a natural language processing engine for MEDLINE abstracts.
Bioinformatics. 2003 Sep 1;19(13):1699-706. doi: 10.1093/bioinformatics/btg207.
6
A Part-Of-Speech term weighting scheme for biomedical information retrieval.
J Biomed Inform. 2016 Oct;63:379-389. doi: 10.1016/j.jbi.2016.08.026. Epub 2016 Sep 1.
7
An entity tagger for recognizing acquired genomic variations in cancer literature.
Bioinformatics. 2004 Nov 22;20(17):3249-51. doi: 10.1093/bioinformatics/bth350. Epub 2004 Jun 4.
8
Investigation into biomedical literature classification using support vector machines.
Proc IEEE Comput Syst Bioinform Conf. 2005:366-74. doi: 10.1109/csb.2005.36.
9
A2A: a platform for research in biomedical literature search.
BMC Bioinformatics. 2020 Dec 21;21(Suppl 19):572. doi: 10.1186/s12859-020-03894-8.
10
A quantitative model for linking two disparate sets of articles in MEDLINE.
Bioinformatics. 2007 Jul 1;23(13):1658-65. doi: 10.1093/bioinformatics/btm161. Epub 2007 Apr 26.

Cited by

1
Urine proteomics for profiling of human disease using high accuracy mass spectrometry.
Proteomics Clin Appl. 2009 Sep 1;3(9):1052-1061. doi: 10.1002/prca.200900008.
