Associate Professor, School of Information Science, University of Kentucky, Lexington, KY
Assistant Professor, School of Information Science, University of Kentucky, Lexington, KY
J Med Libr Assoc. 2019 Jul;107(3):364-373. doi: 10.5195/jmla.2019.622. Epub 2019 Jul 1.
Hypothetically, content in MEDLINE records is consistent across multiple platforms. Though platforms have different interfaces and requirements for query syntax, results should be similar when the syntax is controlled for across the platforms. The authors investigated how search result counts varied when searching records among five MEDLINE platforms.
We created 29 sets of search queries targeting various metadata fields and operators. Within search sets, we adapted 5 distinct, compatible queries to search 5 MEDLINE platforms (PubMed, ProQuest, EBSCO, Web of Science, and Ovid), totaling 145 final queries. The 5 queries were designed to be logically and semantically equivalent and were modified only to match platform syntax requirements. We analyzed the result counts and compared PubMed's MEDLINE result counts to result counts from the other platforms. We identified outliers by measuring the result count deviations using modified z-scores centered around PubMed's MEDLINE results.
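The outlier step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' actual procedure: the abstract does not give the formula, so the sketch assumes the scores are computed within one search set, uses the conventional modified z-score constant 0.6745, replaces the usual median center with PubMed's count, and applies the commonly used cutoff of 3.5. The function name `modified_z_scores` and the result counts are hypothetical.

```python
import math
from statistics import median

SCALE = 0.6745    # conventional constant for modified z-scores
THRESHOLD = 3.5   # commonly used cutoff; an assumption, not stated in the abstract


def modified_z_scores(counts, reference="PubMed"):
    """Score each platform's result count against the reference platform.

    Assumption: deviations are taken from the reference count (PubMed)
    rather than from the median, and the median absolute deviation (MAD)
    is computed over those deviations.
    """
    ref = counts[reference]
    deviations = {p: c - ref for p, c in counts.items() if p != reference}
    mad = median([abs(d) for d in deviations.values()])

    scores = {}
    for platform, d in deviations.items():
        if d == 0:
            scores[platform] = 0.0
        elif mad == 0:
            # Most platforms match the reference exactly; any deviation is extreme.
            scores[platform] = math.copysign(math.inf, d)
        else:
            scores[platform] = SCALE * d / mad
    return scores


# Hypothetical result counts for one search set (not data from the study).
example_counts = {
    "PubMed": 1000,
    "Ovid": 990,
    "EBSCO": 1010,
    "ProQuest": 1200,
    "Web of Science": 400,
}

scores = modified_z_scores(example_counts)
outliers = [p for p, z in scores.items() if abs(z) > THRESHOLD]
```

With these made-up counts, only the platform whose count falls far below PubMed's exceeds the cutoff. Centering on PubMed rather than the median means a platform is judged by how far it departs from the reference platform, not from the group consensus.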
Web of Science and ProQuest searches were the most likely to deviate from the equivalent PubMed searches. EBSCO and Ovid were less likely to deviate from PubMed searches. Ovid's results were the most consistent with PubMed's, but Ovid appeared to apply an indexing algorithm that yielded smaller retrieval sets than the equivalent PubMed searches. Web of Science exhibited problems with exploding or not exploding Medical Subject Headings (MeSH) terms.
Platform-specific enhancements to search interfaces affect record retrieval and challenge the expectation that MEDLINE platforms should, by default, be treated as MEDLINE. Substantial inconsistencies in search result counts, as demonstrated here, should raise concerns about how platform-specific features influence search results.