Multilingual identification of nuanced dimensions of hope speech in social media texts.

Authors

Sidorov Grigori, Balouchzahi Fazlourrahman, Ramos Luis, Gómez-Adorno Helena, Gelbukh Alexander

Affiliations

Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), Mexico City, Mexico.

Universidad Nacional Autónoma de México (UNAM), Instituto de Investigaciones en Matemáticas Aplicadas y Sistemas (IIMAS), Mexico City, Mexico.

Publication information

Sci Rep. 2025 Jul 23;15(1):26783. doi: 10.1038/s41598-025-10683-x.

Abstract

Hope plays a crucial role in human psychology and well-being, yet its expression and detection across languages remain underexplored in natural language processing (NLP). This study presents MIND-HOPE, the first multiclass hope speech detection datasets for Spanish and German, collected from Twitter. The annotated datasets comprise 19,183 Spanish tweets and 21,043 German tweets, categorized into four classes: Generalized Hope, Realistic Hope, Unrealistic Hope, and Not Hope. The paper also provides a comprehensive review of existing hope speech datasets and detection techniques, and conducts a comparative evaluation of traditional machine learning, deep learning, and transformer-based approaches. Experimental results, obtained using 5-fold cross-validation, show that monolingual transformer models (e.g., bert-base-german-dbmdz-uncased and bert-base-spanish-wwm-uncased) consistently outperform multilingual models (e.g., mBERT, XLM-RoBERTa) in both binary and multiclass hope detection tasks. These findings underscore the value of language-specific fine-tuning for nuanced affective computing tasks. This study advances sentiment analysis by addressing a novel and underrepresented affective dimension, hope, and proposes robust multilingual benchmarks for future research. Theoretically, it contributes to a deeper understanding of hope as a complex emotional state with practical implications for mental health monitoring, social well-being analysis, and positive content recommendation in online spaces. By modeling hope across languages and categories, this research opens new directions in affective NLP and cross-cultural computational social science.
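As an illustration of the 5-fold cross-validation protocol mentioned above, the following is a minimal sketch of a stratified split over the four class labels. It is not the authors' code: the abstract does not say whether the folds were stratified, and the function name `stratified_k_fold`, the seed, and the toy label list are assumptions made only for this example.

```python
import random
from collections import defaultdict

# The four classes named in the abstract.
LABELS = ["Generalized Hope", "Realistic Hope", "Unrealistic Hope", "Not Hope"]


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs, spreading each class evenly over k folds.

    Hypothetical helper for illustration; stratification is an assumption,
    not something the abstract states.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal the shuffled indices of each class round-robin into the k folds.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy data (made up): 10 placeholder examples per class, 40 in total.
toy_labels = [label for label in LABELS for _ in range(10)]
splits = list(stratified_k_fold(toy_labels, k=5))

for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train_idx)} test={len(test_idx)}")
```

In an actual experiment, a model would be trained on `train_idx` and scored on `test_idx` for each fold, with the reported result being the average over the five folds.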

