
A Multimodel-Based Deep Learning Framework for Short Text Multiclass Classification with the Imbalanced and Extremely Small Data Set.

Affiliation

China University of Mining and Technology, School of Computer Science and Technology, Xuzhou, China.

Publication Information

Comput Intell Neurosci. 2022 Oct 6;2022:7183207. doi: 10.1155/2022/7183207. eCollection 2022.

Abstract

Text classification plays an important role in many practical applications. Real-world applications, however, often involve extremely small datasets. Most existing methods adopt pretrained neural network models to handle such datasets, but these methods are either difficult to deploy on mobile devices because of their large size or cannot fully extract the deep semantic information between phrases and clauses. This paper proposes a multimodel-based deep learning framework for short-text multiclass classification with an imbalanced and extremely small dataset. The framework comprises five layers: an encoder layer, a word-level LSTM network layer, a sentence-level LSTM network layer, a max-pooling layer, and a SoftMax layer. The encoder layer uses DistilBERT to obtain context-sensitive dynamic word vectors that traditional feature-engineering methods struggle to represent; because the transformer in this layer is distilled, the framework is compressed. The next two layers extract deep semantic information: the encoder output is fed to a bidirectional LSTM network, and feature matrices are extracted hierarchically through the word-level and sentence-level LSTMs to obtain a fine-grained semantic representation. The max-pooling layer then reduces the feature matrix to a lower-dimensional matrix, preserving only the most salient features. Finally, the feature matrix is passed to a fully connected SoftMax layer, whose activation function converts the predicted linear vector into a probability for each class. Extensive experiments on two public benchmarks demonstrate the effectiveness of the proposed approach on extremely small datasets.
It matches state-of-the-art baselines in precision, recall, accuracy, and F1 score, and comparisons of model size, training time, and convergence epochs show that the method can be deployed faster and lighter on mobile devices.
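The final two stages described above can be sketched in a few lines. This is a minimal, hedged illustration, not the authors' implementation: the feature matrix here is random stand-in data for the output of the sentence-level BiLSTM, and the shapes, weights, and class count are arbitrary. It shows only the two operations the abstract names: max-pooling that keeps the strongest activation per hidden dimension, and a fully connected layer whose SoftMax converts linear scores into per-class probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

def max_pool(features):
    """Collapse a (timesteps, hidden) feature matrix to (hidden,)
    by keeping the maximum activation in each hidden dimension."""
    return features.max(axis=0)

def softmax(z):
    """Convert a linear score vector into a probability distribution."""
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Stand-in for the sentence-level BiLSTM output: 6 timesteps, 8 hidden units.
features = rng.standard_normal((6, 8))
pooled = max_pool(features)          # shape (8,): only the salient features remain

# Fully connected projection to 4 hypothetical classes (random weights).
W = rng.standard_normal((8, 4))
b = np.zeros(4)
probs = softmax(pooled @ W + b)

print(probs.shape)  # (4,) — one probability per class
print(probs.sum())  # sums to 1.0
```

In the paper's pipeline these weights would be learned end-to-end with the DistilBERT encoder and the two LSTM layers; the sketch only fixes the data flow between the last two stages.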


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/917f/9560856/c655f3a56652/CIN2022-7183207.001.jpg
