Kilic Yasir, Tulu Cagatay Neftali
Computer Engineering Department, Adana Alparslan Turkes Science and Technology University, Adana, Turkey.
Software Engineering Department, Adana Alparslan Turkes Science and Technology University, Adana, Turkey.
PeerJ Comput Sci. 2025 Mar 21;11:e2729. doi: 10.7717/peerj-cs.2729. eCollection 2025.
Sentiment classification is a widely studied problem in natural language processing (NLP) that focuses on identifying the sentiment expressed in text and categorizing it into predefined classes, such as positive, negative, or neutral. As sentiment classification solutions are increasingly integrated into real-world applications, such as analyzing customer feedback in business reviews (e.g., hotel reviews) or monitoring public sentiment on social media, the importance of both their accuracy and their explainability has become widely acknowledged. In Turkish, the problem is more challenging because of the language's complex agglutinative structure. Many solutions have been proposed in the literature, but they are generally based on black-box models. The explainability of such artificial intelligence (AI) models has therefore become as important as their accuracy, which has in turn increased the importance of studies on explaining AI model decisions. Most existing studies explain a model's decision in terms of the importance of a single feature or token, but this does not provide full explainability because of the complex lexical and semantic relations in text. To fill these gaps in the Turkish NLP literature, in this article we propose a graph-aware explainability solution for Turkish sentiment analysis named TurkSentGraphExp. The solution provides both classification and explainability for sentiment classification of Turkish texts by considering the semantic structure of suffixes, accommodating the agglutinative nature of Turkish, and capturing complex relationships through graph representations. Unlike traditional black-box learning models, this framework leverages an inherent graph representation learning (GRL) model to introduce rational phrase-level explainability.
We conduct several experiments to quantify the effectiveness of this framework. The experimental results indicate that the proposed model achieves a 10% to 40% improvement in explainability compared to state-of-the-art methods across varying sparsity levels, further highlighting its effectiveness and robustness. Moreover, the experimental results, supported by a case study, reveal that the semantic relationships arising from affixes in Turkish texts can be identified as part of the model's decision-making process, demonstrating the proposed solution's ability to capture the agglutinative structure of Turkish.