The Yancheng School of Clinical Medicine of Nanjing Medical University, Jiangsu 224008, China.
Quality Management Division, Yancheng Third People's Hospital, Jiangsu 224008, China.
Math Biosci Eng. 2023 Jan;20(2):1981-1992. doi: 10.3934/mbe.2023091. Epub 2022 Nov 9.
Text classification is a fundamental task in natural language processing. Chinese text classification suffers from sparse text features, ambiguity in word segmentation, and poor performance of classification models. We propose a dual-channel text classification model (DCCL) that combines CNN and LSTM with a self-attention mechanism. The model feeds word vectors into a dual-channel neural network. In one channel, multiple CNNs extract N-gram information over different word windows, and their outputs are concatenated to enrich the local feature representation. In the other channel, a BiLSTM extracts the semantic associations of the context to obtain a high-level, sentence-level feature representation; its output is weighted by self-attention to reduce the influence of noisy features. The outputs of the two channels are concatenated and fed into a softmax layer for classification. In multiple comparison experiments, the DCCL model achieved F1-scores of 90.07% and 96.26% on the Sougou and THUNews datasets, respectively, improvements of 3.24% and 2.19% over the baseline model. The DCCL model alleviates the loss of word-order information in CNNs and the vanishing-gradient problem of BiLSTM when processing text sequences, effectively integrates local and global text features, and highlights key information. Its classification performance is strong, making it well suited to text classification tasks.
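The dual-channel structure described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the dimensions, the simple dot-product form of self-attention, and the random stand-in for BiLSTM hidden states are all assumptions made for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, d = 6, 8                       # sequence length, embedding size (illustrative)
E = rng.standard_normal((T, d))   # word vectors fed to both channels

# Channel 1: CNNs over several word windows (N-gram sizes), each
# max-pooled over time, then concatenated into one local-feature vector.
def conv_maxpool(E, n):
    W = rng.standard_normal(n * d)                 # one filter of window size n
    outs = [W @ E[i:i + n].ravel() for i in range(T - n + 1)]
    return max(outs)                               # max over time

local = np.array([conv_maxpool(E, n) for n in (2, 3, 4)])

# Channel 2: stand-in for BiLSTM hidden states (random here), with
# dot-product self-attention weighting to down-weight noisy time steps.
H = rng.standard_normal((T, d))                    # placeholder for BiLSTM outputs
A = softmax(H @ H.T, axis=-1)                      # (T, T) attention weights
weighted = (A @ H).mean(axis=0)                    # attention-weighted sentence vector

# Concatenate both channels and classify with softmax.
features = np.concatenate([local, weighted])
W_out = rng.standard_normal((features.size, 3))    # 3 classes, illustrative
probs = softmax(features @ W_out)
print(probs.shape)
```

The design choice the abstract motivates is visible here: the CNN channel captures position-local N-gram patterns while the attention-weighted recurrent channel carries sentence-level context, and concatenation lets the softmax classifier draw on both.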