Department of Classical Philology and Italian Studies, Research Centre for Open Scholarly Metadata, University of Bologna, Bologna, Italy.
Department of Classical Philology and Italian Studies, Digital Humanities Advanced Research Centre (/DH.arc), University of Bologna, Bologna, Italy.
PLoS One. 2022 Jul 19;17(7):e0270872. doi: 10.1371/journal.pone.0270872. eCollection 2022.
In this article, we present a methodology that takes as input a collection of retracted articles, gathers the entities citing them, characterizes those entities along multiple dimensions (discipline, year of publication, sentiment, etc.), and applies quantitative and qualitative analysis to the collected values. The methodology comprises four phases: (1) identifying, retrieving, and extracting basic metadata of the entities that have cited a retracted article; (2) extracting and labeling additional features based on the textual content of the citing entities; (3) building a descriptive statistical summary of the collected data; and (4) running a topic modeling analysis. The goal of the methodology is to generate data and visualizations that help in understanding possible behaviors related to retraction cases. We present the methodology in a structured, step-by-step form following its four phases, discuss its limitations and possible workarounds, and list planned future improvements.
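As a rough illustration of how the first three phases of the methodology could fit together, the following minimal sketch runs a toy pipeline over three made-up citing entities. Everything in it is an assumption introduced for illustration: the `CitingEntity` record, the DOIs, the keyword lists, and the `label_sentiment`, `annotate`, and `summarize` helpers are hypothetical and are not taken from the article. The keyword rule is only a placeholder for whatever labeling procedure is actually used, and phase 4 (topic modeling) is left out because it would require a third-party library.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CitingEntity:
    """Phase 1 output: basic metadata of one entity citing a retracted article."""
    doi: str                 # made-up identifiers are used below
    year: int                # year of publication of the citing entity
    discipline: str
    citation_context: str    # sentence in which the retracted article is cited
    sentiment: str = "unlabelled"  # filled in during phase 2


# Hypothetical cue words; a real study would need a far more careful scheme.
NEGATIVE_CUES = ("retracted", "flawed", "discredited")
POSITIVE_CUES = ("confirms", "supports", "consistent with")


def label_sentiment(context: str) -> str:
    """Phase 2 placeholder: label a citation context by simple keyword matching."""
    text = context.lower()
    if any(cue in text for cue in NEGATIVE_CUES):
        return "negative"
    if any(cue in text for cue in POSITIVE_CUES):
        return "positive"
    return "neutral"


def annotate(entities):
    """Phase 2: attach a sentiment label to every citing entity (in place)."""
    for entity in entities:
        entity.sentiment = label_sentiment(entity.citation_context)
    return entities


def summarize(entities, retraction_year):
    """Phase 3: descriptive counts by discipline, sentiment, and period."""
    return {
        "total": len(entities),
        "by_discipline": Counter(e.discipline for e in entities),
        "by_sentiment": Counter(e.sentiment for e in entities),
        "by_period": Counter(
            "after_retraction" if e.year > retraction_year else "up_to_retraction"
            for e in entities
        ),
    }


# Phase 1 stand-in: in practice these records would be retrieved from a
# citation index; here they are invented toy data.
ENTITIES = [
    CitingEntity("10.0000/example.1", 2008, "Medicine",
                 "Our results are consistent with the earlier study."),
    CitingEntity("10.0000/example.2", 2012, "Medicine",
                 "The study was later retracted and its findings discredited."),
    CitingEntity("10.0000/example.3", 2015, "Social Sciences",
                 "Public reaction to the article has been widely documented."),
]

summary = summarize(annotate(ENTITIES), retraction_year=2010)

if __name__ == "__main__":
    for key, value in summary.items():
        print(key, dict(value) if isinstance(value, Counter) else value)
```

Keeping each phase as a separate function mirrors the step-by-step structure of the methodology: the collection step, the labeling step, and the summary step can each be replaced independently, for example by swapping the keyword rule for manual annotation.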