Department of Clinical Epidemiology, Leiden University Medical Center, PO Box 9600, 2300 RC, the Netherlands.
J Clin Epidemiol. 2020 May;121:55-61. doi: 10.1016/j.jclinepi.2020.01.009. Epub 2020 Jan 23.
Article full texts are often inaccessible via the standard search engines of biomedical literature, such as PubMed and Embase, which are commonly used for systematic reviews. Excluding the full-text bodies from a literature search may result in a small or selective subset of articles being included in the review because of the limited information that is available in only title, abstract, and keywords. This article describes a comparison of search strategies based on a systematic literature review of all articles published in 5 top-ranked epidemiology journals between 2000 and 2017.
Based on a text-mining approach, we studied how nine different methodological topics were mentioned across text fields (title, abstract, keywords, and text body). The following methodological topics were studied: propensity score methods, inverse probability weighting, marginal structural modeling, multiple imputation, Kaplan-Meier estimation, number needed to treat, measurement error, randomized controlled trial, and latent class analysis.
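As an illustration of field-level topic detection, the following is a minimal sketch and not the authors' actual pipeline. The regular expressions, the `fields_mentioning` helper, and the toy article are all hypothetical; the study's real search terms are not given in this abstract.

```python
import re

# Hypothetical, simplified patterns for three of the nine topics.
# The study's actual search terms are not stated in the abstract.
TOPIC_PATTERNS = {
    "propensity score methods": re.compile(r"propensity[\s-]+scor", re.IGNORECASE),
    "multiple imputation": re.compile(r"multipl[ey][\s-]+imput", re.IGNORECASE),
    "Kaplan-Meier estimation": re.compile(r"kaplan[\s-]+meier", re.IGNORECASE),
}

# The four text fields compared in the study.
FIELDS = ("title", "abstract", "keywords", "body")


def fields_mentioning(article, pattern):
    """Return the names of the text fields in which the pattern occurs."""
    return [field for field in FIELDS if pattern.search(article.get(field, ""))]


# Toy article, invented purely for illustration.
article = {
    "title": "Drug exposure and fracture risk in a cohort study",
    "abstract": "We examined the association between exposure and outcome.",
    "keywords": "cohort; fracture",
    "body": (
        "Confounding was addressed with propensity-score matching. "
        "Missing covariates were handled by multiple imputation."
    ),
}

mentions = {
    topic: fields_mentioning(article, pattern)
    for topic, pattern in TOPIC_PATTERNS.items()
}
print(mentions)
```

In this toy case, two topics appear only in the text body, so a search restricted to title, abstract, and keywords would miss the article for both.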
In total, 31,641 Hypertext Markup Language (HTML) files were downloaded from the journals' websites. For all methodological topics and journals, at most 50% of articles that mentioned a topic in the text body also mentioned it in the title, abstract, or keywords. For several topics, reporting in the title, abstract, or keywords gradually decreased over calendar time.
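The proportion reported here can be computed as sketched below, assuming each article has already been reduced to two flags: whether the topic appears in the title, abstract, or keywords, and whether it appears in the text body. The function name `front_matter_share` and the toy flags are hypothetical and are not the study's data.

```python
def front_matter_share(flags):
    """Share of body-mention articles that also mention the topic up front.

    flags: iterable of (in_front_matter, in_body) boolean pairs, one per article.
    Returns None when no article mentions the topic in its text body.
    """
    # Keep only articles whose text body mentions the topic.
    front_flags = [in_front for in_front, in_body in flags if in_body]
    if not front_flags:
        return None
    return sum(front_flags) / len(front_flags)


# Toy flags, invented for illustration: four articles mention the topic
# in the body, and one of those also mentions it in the front matter.
toy_flags = [
    (True, True),
    (False, True),
    (False, True),
    (False, False),
    (False, True),
]

share = front_matter_share(toy_flags)
print(share)
```

A share at or below 0.5 corresponds to the pattern described above: at least half of the articles that mention the topic in the body would not be retrieved by a search of title, abstract, and keywords.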
Literature searches based on title, abstract, and keywords alone may not be sufficiently sensitive for studies of epidemiological research practice. This study also illustrates the potential value of full-text literature searches, provided that full-text bodies are accessible to the search.