Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals.

Authors

Wulff Dirk U, Meier Dominik S, Mata Rui

Affiliations

Max Planck Institute for Human Development, Berlin, Germany.

University of Basel, Basel, Switzerland.

Publication information

Sustain Sci. 2024;19(5):1773-1787. doi: 10.1007/s11625-024-01516-3. Epub 2024 Jul 24.

Abstract

A number of labeling systems based on text have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of prominent SDG labeling systems using a variety of text sources and show that these differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to specific SDGs relative to others), and are susceptible to the type and amount of text analyzed. We then show that an ensemble model that pools SDG labeling systems alleviates some of these limitations, exceeding the performance of the individual SDG labeling systems considered. We conclude that researchers and policymakers should care about the choice of the SDG labeling system and that ensemble methods should be favored when drawing conclusions about the absolute and relative prevalence of work on the SDGs based on automated methods.
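The two metrics named in the abstract, and the idea of pooling several labeling systems, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy labels, the three hypothetical systems, and the simple majority vote, which stands in for the ensemble and is not necessarily the pooling method the authors used.

```python
from typing import List, Sequence


def sensitivity(truth: Sequence[int], pred: Sequence[int]) -> float:
    """True-positive rate: share of truly relevant documents that were labeled."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(truth: Sequence[int], pred: Sequence[int]) -> float:
    """True-negative rate: share of irrelevant documents that were left unlabeled."""
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    return tn / (tn + fp) if (tn + fp) else float("nan")


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Pool several systems: label a document if more than half of them do."""
    n_systems = len(predictions)
    return [int(2 * sum(votes) > n_systems) for votes in zip(*predictions)]


# Hypothetical toy data: 1 = document relates to a given SDG, 0 = it does not.
truth = [1, 1, 1, 0, 0, 0]
system_a = [1, 1, 0, 0, 0, 1]  # hypothetical labeling system A
system_b = [1, 0, 1, 0, 1, 0]  # hypothetical labeling system B
system_c = [0, 1, 1, 0, 0, 0]  # hypothetical labeling system C

ensemble = majority_vote([system_a, system_b, system_c])

for name, pred in [("A", system_a), ("B", system_b),
                   ("C", system_c), ("ensemble", ensemble)]:
    print(name, round(sensitivity(truth, pred), 2), round(specificity(truth, pred), 2))
```

In this contrived example each system errs on different documents, so the vote cancels the individual errors. Real systems often share biases, in which case pooling would help less.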

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc7c/11366727/ee697e431fb6/11625_2024_1516_Fig1_HTML.jpg
