Zhu Rong, Gao Hua-Hui, Wang Yong
School of Computer Science, Qufu Normal University, Rizhao, China.
Laboratory Experimental Teaching and Equipment Management Center, Qufu Normal University, Rizhao, China.
PeerJ Comput Sci. 2024 Aug 19;10:e2240. doi: 10.7717/peerj-cs.2240. eCollection 2024.
Most existing text classification methods focus on extracting highly discriminative feature representations from the text, which can be computationally inefficient. To address this limitation, this study proposes using label information directly to construct text representations, so that label data is exploited alongside the textual content.
The method began by pre-processing texts and labels separately and encoding both through a projection layer. A conventional self-attention model, enhanced with instance normalization (IN) and Gaussian Error Linear Unit (GELU) activation functions, was then used to assess emotional valence in review texts. An extended self-attention mechanism was further developed to integrate text and label information efficiently. In the final stage, an adaptive label encoder extracted the relevant label information from the combined text-label representation.
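The pipeline described above can be illustrated with a minimal sketch. The abstract does not give the actual architecture, so everything here is an assumption: the function names are hypothetical, there are no learned projection weights, and the "label queries attend over text" step is only one plausible reading of how text and label information might be combined. It shows self-attention over token vectors, instance normalization, GELU, and a label-to-text attention step in plain Python.

```python
import math

def gelu(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def instance_norm(rows, eps=1e-5):
    # Normalize each feature channel across the sequence positions
    # of a single sample (rows = positions, columns = channels).
    n, d = len(rows), len(rows[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        for i in range(n):
            out[i][j] = (rows[i][j] - mean) / math.sqrt(var + eps)
    return out

def attention(queries, keys, values):
    # Scaled dot-product attention without learned projections.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def text_label_block(text_emb, label_emb):
    # Step 1 (assumed): self-attention over tokens, then IN and GELU.
    h = attention(text_emb, text_emb, text_emb)
    h = [[gelu(x) for x in row] for row in instance_norm(h)]
    # Step 2 (assumed): each label vector queries the text,
    # producing one text-aware vector per label.
    return attention(label_emb, h, h)

# Toy input: 3 token vectors and 2 label vectors, 4 dimensions each.
text = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.9, 0.1]]
labels = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
label_aware = text_label_block(text, labels)
print(len(label_aware), len(label_aware[0]))
```

In this sketch the output has one row per label, which is the shape a downstream label encoder or classifier would need. A real model would add learned query, key, and value projections and train them end to end.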
Empirical evaluations show that the proposed model outperforms existing methods in classification performance, as evidenced quantitatively by its higher micro-F1 score. This result indicates that integrating label information into text classification is effective: the model not only addresses computational inefficiency but also improves classification accuracy.
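For readers unfamiliar with the metric, micro-F1 pools true positives, false positives, and false negatives over all labels before computing a single score, i.e. 2TP / (2TP + FP + FN). The sketch below is a generic implementation with made-up toy labels, not the paper's evaluation code or data; the function name `micro_f1` and the label names are assumptions for illustration.

```python
def micro_f1(y_true, y_pred):
    # y_true and y_pred are sequences of label sets, one set per sample.
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        t, p = set(true_labels), set(pred_labels)
        tp += len(t & p)   # labels predicted and correct
        fp += len(p - t)   # labels predicted but wrong
        fn += len(t - p)   # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical toy data: three reviews with sentiment labels.
y_true = [{"pos"}, {"neg"}, {"pos", "neu"}]
y_pred = [{"pos"}, {"pos"}, {"pos"}]
score = micro_f1(y_true, y_pred)
print(round(score, 4))  # TP=2, FP=1, FN=2, so 4/7 ≈ 0.5714
```

Because counts are pooled across labels, frequent labels weigh more heavily in micro-F1 than in macro-F1, which averages per-label scores.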