College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China.
Xinjiang Multilingual Information Technology Key Laboratory, Urumqi, 830046, China.
Sci Rep. 2021 Nov 11;11(1):22038. doi: 10.1038/s41598-021-01385-1.
Cross-domain sentiment classification can be divided into two steps: extracting the text representation and reducing the domain discrepancy. Existing methods mostly focus on learning domain-invariant information and rarely consider domain-specific semantic information, which could also help cross-domain sentiment classification. In addition, traditional adversarial-based models focus merely on aligning the global distribution and ignore maximizing the class-specific decision boundaries. To solve these problems, we propose a context-aware semantic adaptation (CASA) network for cross-domain implicit sentiment classification (ISC). CASA can provide more semantic relationships and an accurate understanding of the emotion-changing process for ISC tasks lacking explicit emotion words. (1) To obtain inter- and intrasentence semantic associations, our model builds a context-aware heterogeneous graph (CAHG), which aggregates intrasentence dependency information and intersentence node interaction information, followed by an attention mechanism that retains high-level domain-specific features. (2) Moreover, we propose a new multigrain discriminator (MGD) to effectively reduce the interdomain distribution discrepancy and improve intradomain class discrimination. Experimental results on the Chinese implicit emotion dataset and four public explicit datasets demonstrate the effectiveness of the different modules compared with existing models.