Hussain Muhammad, Chen Caikou, Hussain Muzammil, Anwar Muhammad, Abaker Mohammed, Abdelmaboud Abdelzahir, Yamin Iqra
College of Information and Artificial Intelligence, Yangzhou University, Yangzhou, 225000, People's Republic of China.
Department of Software Engineering, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, 19328, Jordan.
Sci Rep. 2025 Aug 17;15(1):30104. doi: 10.1038/s41598-025-16001-9.
Accurate emotion recognition in social media text is critical for applications such as sentiment analysis, mental health monitoring, and human-computer interaction. However, existing approaches face challenges such as computational complexity and class imbalance, limiting their deployment in resource-constrained environments. While transformer-based models achieve state-of-the-art performance, their size and latency hinder real-time applications. To address these issues, we propose a novel knowledge distillation framework that transfers knowledge from a fine-tuned BERT-base teacher model to lightweight DistilBERT and ALBERT student models optimised for efficient emotion recognition. Our approach integrates a hybrid loss function combining focal loss and Kullback-Leibler (KL) divergence to enhance minority-class recognition, attention-head alignment for effective contextual knowledge transfer, and semantic-preserving data augmentation to mitigate class imbalance. Experiments on two datasets, Twitter Emotions (416K samples, six classes) and Social Media Emotion (75K samples, five classes), show that our distilled models achieve near-teacher performance (97.35% and 73.86% accuracy, respectively), with accuracy drops of less than 1% and 6%, while reducing model size by 40% and inference latency by 3.2×. Notably, our method significantly improves F1-scores for minority classes. Our work sets a new state of the art in efficient emotion recognition, enabling practical deployment in edge computing and mobile applications.
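To illustrate the hybrid loss named in the abstract, the following is a minimal PyTorch sketch combining focal loss on the ground-truth emotion labels with temperature-scaled KL divergence against the teacher's logits. The weighting alpha, temperature T, and focal-loss gamma are illustrative assumptions, not values reported by the paper, and the function names are hypothetical.

```python
# Minimal sketch of a focal-loss + KL-divergence distillation objective.
# alpha, T, and gamma are placeholder hyperparameters, not the paper's settings.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # Multi-class focal loss: down-weights well-classified examples so that
    # minority emotion classes contribute more to the gradient.
    log_probs = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_probs, targets, reduction="none")
    pt = log_probs.exp().gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - pt) ** gamma * ce).mean()

def hybrid_kd_loss(student_logits, teacher_logits, targets,
                   alpha=0.5, T=2.0, gamma=2.0):
    # Soft-target term: KL divergence between temperature-softened teacher and
    # student distributions, scaled by T^2 as is conventional in distillation.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: focal loss on the ground-truth labels.
    fl = focal_loss(student_logits, targets, gamma=gamma)
    return alpha * kd + (1.0 - alpha) * fl

if __name__ == "__main__":
    # Random tensors stand in for teacher/student outputs on a 6-class task.
    student = torch.randn(8, 6)
    teacher = torch.randn(8, 6)
    labels = torch.randint(0, 6, (8,))
    print(hybrid_kd_loss(student, teacher, labels))
```

In this formulation, alpha trades off imitation of the teacher's soft predictions against direct supervision from the (imbalanced) hard labels; the paper's attention-head alignment term would be added on top of this objective.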