School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.
J Am Med Inform Assoc. 2021 Jan 15;28(1):132-137. doi: 10.1093/jamia/ocaa271.
The COVID-19 pandemic has created a tremendous need for access to the latest scientific information, leading to both corpora of COVID-19 literature and search engines to query such data. While most search engine research is performed in academia with rigorous evaluation, major commercial companies dominate the web search market. Thus, commercial pandemic-specific search engines are expected to gain much more traction than academic alternatives, which raises questions about the empirical performance of these tools. This paper empirically evaluates two commercial search engines for COVID-19 (Google and Amazon) in comparison with academic prototypes evaluated in the TREC-COVID task. We took several steps to reduce bias in the manual judgments and ensure a fair comparison of all systems. We find that the commercial search engines sizably underperformed those evaluated under TREC-COVID. This has implications for trust in popular health search engines and for developing biomedical search engines for future health crises.