School of Medicine, Oregon Health & Science University, Portland, OR, USA.
Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA.
J Biomed Inform. 2021 May;117:103745. doi: 10.1016/j.jbi.2021.103745. Epub 2021 Apr 6.
The COVID-19 pandemic has produced a rapidly growing body of scientific publications from journal articles, preprints, and other sources. The TREC-COVID Challenge was created to evaluate information retrieval (IR) methods and systems on this quickly expanding corpus. Using the COVID-19 Open Research Dataset (CORD-19), several dozen research teams participated across the five rounds of the TREC-COVID Challenge. While previous work has compared IR techniques on other test collections, no studies have analyzed the methods used by TREC-COVID participants. We manually reviewed team run reports from Rounds 2 and 5, extracted features from the documented methodologies, and used univariate and multivariate regression-based analyses to identify features associated with higher retrieval performance. We observed that fine-tuning with datasets of relevance judgments, MS-MARCO, and CORD-19 document vectors was associated with improved performance in Round 2 but not in Round 5. Although the reduced heterogeneity of runs in Round 5 may explain the lack of significance in that round, fine-tuning has been found to improve search performance in previous challenge evaluations by improving a system's ability to map relevant queries and phrases to documents. Furthermore, term expansion was associated with improved system performance, and use of the narrative field in the TREC-COVID topics was associated with decreased system performance in both rounds. These findings emphasize the need for clear queries in search. While our study has some limitations in its generalizability and in the scope of techniques analyzed, we identified IR techniques that may be useful in building search systems for COVID-19 using the TREC-COVID test collections.
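The analysis described above can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' actual pipeline): each run is encoded as binary features extracted from its report, and a retrieval metric is regressed on those features, first one feature at a time (univariate) and then jointly (multivariate). The feature names, run data, and metric values are invented for illustration.

```python
import numpy as np

# Illustrative features a run report might be coded with; these names
# and all data below are assumptions, not values from the study.
feature_names = ["fine_tuned", "term_expansion", "used_narrative"]

# Synthetic run matrix: one row per run, one column per binary feature.
X = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
], dtype=float)
# Made-up per-run retrieval scores (e.g., NDCG@10).
y = np.array([0.72, 0.65, 0.58, 0.45, 0.66, 0.50])

def multivariate_ols(X, y):
    """Fit y = b0 + X @ b by least squares; return (intercept, coefs)."""
    Xd = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta[0], beta[1:]

def univariate_ols(X, y):
    """One simple regression per feature (the univariate pass)."""
    return [multivariate_ols(X[:, [j]], y)[1][0] for j in range(X.shape[1])]

intercept, coefs = multivariate_ols(X, y)
for name, b in zip(feature_names, coefs):
    print(f"{name}: {b:+.3f}")
```

In the study itself, a positive coefficient on a feature like fine-tuning would correspond to the association with improved performance reported for Round 2, and a negative coefficient on narrative-field use to the decrease observed in both rounds; a full analysis would also report significance, which this sketch omits.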