Beese Dominik, Altunbaş Begüm, Güzeler Görkem, Eger Steffen
Technische Universität Darmstadt, Darmstadt, Hessen, Germany.
Technische Universität München, München, Bayern, Germany.
R Soc Open Sci. 2023 Mar 8;10(3):221159. doi: 10.1098/rsos.221159. eCollection 2023 Mar.
In this paper, we classify scientific articles in natural language processing (NLP) and machine learning (ML), two core subfields of artificial intelligence (AI), according to whether (i) they extend the current state of the art by introducing novel techniques that beat existing models, or (ii) they mainly criticize the existing state of the art, i.e. argue that it is deficient with respect to some property (e.g. wrong evaluation, wrong datasets, misleading task specification). We refer to contributions under (i) as having a 'positive stance' and contributions under (ii) as having a 'negative stance' (towards related work). We annotate over 1.5k papers from NLP and ML to train a SciBERT-based model that automatically predicts the stance of a paper from its title and abstract. We then analyse large-scale trends in over 41k papers from approximately the last 35 years of NLP and ML, finding that papers have become substantially more positive over time, but that negative papers have also become more negative, and we observe considerably more negative papers in recent years. Negative papers are also more influential in terms of the citations they receive.