Weißer Tim, Saßmannshausen Till, Ohrndorf Dennis, Burggräf Peter, Wagner Johannes
Chair for International Production Engineering and Management, University of Siegen.
MethodsX. 2020 Feb 22;7:100831. doi: 10.1016/j.mex.2020.100831. eCollection 2020.
In a systematic literature review (SLR), researchers are confronted with vast numbers of articles from scientific databases, each of which has to be evaluated manually for its relevance to a given field of observation. The evaluation and filtering phase of prevalent SLR methodologies is therefore time-consuming and difficult to communicate to the intended audience. The proposed method applies natural language processing (NLP) to article metadata, followed by a k-means clustering algorithm, to automatically convert large article corpora into a distribution of focal topics. This allows efficient filtering and makes the process more objective, because the clustering results can be presented and discussed. Beyond that, it allows scientific communities to be identified quickly and therefore offers an iterative perspective on the so far linear SLR methodology.
• NLP and k-means clustering to filter large article corpora during systematic literature reviews.
• Automated clustering filters more efficiently and effectively than manual selection.
• Presenting and discussing the clustering results helps to objectify the otherwise nontransparent filtering step in systematic literature reviews.
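The general idea of turning article metadata into topic clusters can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the NLP steps, so TF-IDF weighting of titles, the small stopword list, the farthest-first centroid initialisation, the function names, and the six example titles are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical article titles standing in for database metadata (invented).
TITLES = [
    "Deep learning for predictive maintenance in manufacturing",
    "Predictive maintenance of manufacturing machines with deep learning",
    "Machine learning for predictive maintenance in manufacturing plants",
    "Supply chain risk management under uncertainty",
    "Risk assessment in global supply chain networks",
    "Managing supply chain risk and disruption",
]

# Assumption: a tiny stopword list; a real review would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    n_docs = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency, one weight per vocabulary term.
        vec = [tf[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1) for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-first initialisation."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [-1] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=3):
    """Highest-weighted vocabulary terms of a centroid, as a rough topic label."""
    ranked = sorted(zip(centroid, vocab), reverse=True)
    return [term for weight, term in ranked[:n] if weight > 0]


if __name__ == "__main__":
    vocab, vectors = tfidf_vectors(TITLES)
    labels, centroids = kmeans(vectors, k=2)
    for c, centroid in enumerate(centroids):
        print(f"cluster {c}: {top_terms(vocab, centroid)}")
        for title, lab in zip(TITLES, labels):
            if lab == c:
                print("   ", title)
```

Reading the top terms of each cluster gives a quick impression of its focal topic, so whole clusters can be kept or discarded and the decision documented. For a real corpus, a library implementation such as scikit-learn's `TfidfVectorizer` and `KMeans` would likely be the more practical choice, and the number of clusters would need to be chosen and justified rather than fixed at two.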