Faculty of Engineering, Tokushima University, Tokushima, Japan.
PLoS One. 2018 Apr 6;13(4):e0194136. doi: 10.1371/journal.pone.0194136. eCollection 2018.
In this paper, we propose an emotion-separated method (SeTF·IDF) that assigns different values to the emotion labels of a sentence, and that yields a clearer visualization of the multi-label Chinese emotional corpus Ren_CECps than values computed with TF·IDF. Motivated by the marked improvement in the visualization map that results from the changed distances among sentences, we are the first group to use the Word Mover's Distance (WMD) algorithm as a feature representation for Chinese text emotion classification. In experiments on Ren_CECps with both an 80%/20% and a 50%/50% training/testing split, WMD features achieve the best F1 scores and show a larger improvement than feature vectors of the same dimension obtained by applying dimension reduction to TF·IDF. Comparative experiments on an English corpus also show that WMD features are effective across languages.
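As a rough illustration of the WMD idea mentioned in the abstract, the sketch below computes the distance for the special case where both sentences have the same number of tokens. In that case every token carries equal mass and the transport problem reduces to an assignment problem, so a brute-force search over permutations gives the exact value for short sentences. The 2-D word vectors, the function names `wmd_equal_length` and `wmd_features`, and the idea of using distances to a set of reference sentences as the feature vector are all assumptions made for this example; the abstract does not specify how the authors build their WMD features.

```python
import math
from itertools import permutations

# Toy 2-D word vectors, invented for illustration only (not from the paper).
TOY_EMBEDDINGS = {
    "happy": (1.0, 0.0),
    "glad": (0.9, 0.1),
    "sad": (-1.0, 0.0),
    "day": (0.0, 1.0),
    "morning": (0.1, 0.9),
}


def wmd_equal_length(doc_a, doc_b, emb):
    """WMD for two token lists of equal, non-zero length.

    With n tokens on each side, every token has mass 1/n, so the optimal
    transport plan is a one-to-one matching. Brute force is factorial in n
    and only practical for very short sentences.
    """
    n = len(doc_a)
    if n == 0 or n != len(doc_b):
        raise ValueError("this sketch needs two non-empty docs of equal length")
    # Euclidean distance between every pair of word vectors.
    cost = [[math.dist(emb[a], emb[b]) for b in doc_b] for a in doc_a]
    # Cheapest one-to-one matching of tokens in doc_a to tokens in doc_b.
    best = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )
    return best / n


def wmd_features(sentence, references, emb):
    """Hypothetical feature vector: WMD from a sentence to each reference."""
    return [wmd_equal_length(sentence, ref, emb) for ref in references]


if __name__ == "__main__":
    refs = [["glad", "morning"], ["sad", "morning"]]
    print(wmd_features(["happy", "day"], refs, TOY_EMBEDDINGS))
```

With these toy vectors, the sentence "happy day" ends up closer to "glad morning" than to "sad morning", which is the kind of distance structure a classifier could exploit. For sentences of unequal length or realistic size, a linear-programming or optimal-transport solver would be needed instead of the permutation search.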