

Investigating Multi-Level Semantic Extraction with Squash Capsules for Short Text Classification

Authors

Li Jing, Zhang Dezheng, Wulamu Aziguli

Affiliations

School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China.

Beijing Key Laboratory of Knowledge Engineering for Materials Science, University of Science and Technology Beijing, Beijing 100083, China.

Publication Info

Entropy (Basel). 2022 Apr 23;24(5):590. doi: 10.3390/e24050590.

DOI: 10.3390/e24050590
PMID: 35626475
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9141385/
Abstract

At present, short text classification is a hot topic in the area of natural language processing. Due to the sparseness and irregularity of short text, the task of short text classification still faces great challenges. In this paper, we propose a new classification model from the aspects of short text representation, global feature extraction and local feature extraction. We use convolutional networks to extract shallow features from short text vectorization, and introduce a multi-level semantic extraction framework. It uses BiLSTM as the encoding layer, while the attention mechanism and normalization are used as the interaction layer. Finally, we concatenate the convolution feature vector and the semantic results of the semantic framework. After several rounds of feature integration, the framework improves the quality of the feature representation. Combined with the capsule network, we obtain high-level local information by dynamic routing and then squash it. In addition, we explore the optimal depth of semantic feature extraction for short text based on the multi-level semantic framework. We utilized four benchmark datasets to demonstrate that our model provides comparable results. The experimental results show that the accuracies on SUBJ, TREC, MR and ProcCons are 93.8%, 91.94%, 82.81% and 98.43%, respectively, which verifies that our model greatly improves classification accuracy and model robustness.
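The "squash" step the abstract refers to is the capsule-network non-linearity of Sabour et al. (2017), which rescales a capsule's output vector so its length lies in [0, 1) while its direction is preserved; the length can then be read as an existence probability. A minimal NumPy sketch (the function name and epsilon guard are illustrative, not taken from the paper's code):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule squash non-linearity: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    Shrinks the vector's length into [0, 1) without changing its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)          # length factor in [0, 1)
    return scale * s / np.sqrt(sq_norm + eps)  # eps avoids division by zero

# Toy example: a single capsule output vector of length 5.
s = np.array([3.0, 4.0])
v = squash(s)
print(np.linalg.norm(v))  # → ~0.9615 (= 25/26), direction unchanged
```

Long input vectors are squashed to lengths just below 1 and short ones toward 0, which is what lets dynamic routing treat capsule lengths as agreement scores.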


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be27/9141385/b87b4dca6eb1/entropy-24-00590-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be27/9141385/3d3b2de99087/entropy-24-00590-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be27/9141385/ba66d4cdfdaa/entropy-24-00590-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be27/9141385/ce35e4c3d30e/entropy-24-00590-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be27/9141385/a0be0d37bcdb/entropy-24-00590-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be27/9141385/dade13994496/entropy-24-00590-g006.jpg

Similar Articles

1. Investigating Multi-Level Semantic Extraction with Squash Capsules for Short Text Classification.
Entropy (Basel). 2022 Apr 23;24(5):590. doi: 10.3390/e24050590.
2. DCCL: Dual-channel hybrid neural network combined with self-attention for text classification.
Math Biosci Eng. 2023 Jan;20(2):1981-1992. doi: 10.3934/mbe.2023091. Epub 2022 Nov 9.
3. Construction and Research on Chinese Semantic Mapping Based on Linguistic Features and Sparse Self-Learning Neural Networks.
Comput Intell Neurosci. 2022 Jun 20;2022:2315802. doi: 10.1155/2022/2315802. eCollection 2022.
4. Author identification of literary works based on text analysis and deep learning.
Heliyon. 2024 Jan 29;10(3):e25464. doi: 10.1016/j.heliyon.2024.e25464. eCollection 2024 Feb 15.
5. A Multimodel-Based Deep Learning Framework for Short Text Multiclass Classification with the Imbalanced and Extremely Small Data Set.
Comput Intell Neurosci. 2022 Oct 6;2022:7183207. doi: 10.1155/2022/7183207. eCollection 2022.
6. CapsTM: capsule network for Chinese medical text matching.
BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):94. doi: 10.1186/s12911-021-01442-9.
7. Attention enhanced capsule network for text classification by encoding syntactic dependency trees with graph convolutional neural network.
PeerJ Comput Sci. 2022 Jan 5;8:e831. doi: 10.7717/peerj-cs.831. eCollection 2022.
8. Text Matching in Insurance Question-Answering Community Based on an Integrated BiLSTM-TextCNN Model Fusing Multi-Feature.
Entropy (Basel). 2023 Apr 10;25(4):639. doi: 10.3390/e25040639.
9. Chinese text classification method based on sentence information enhancement and feature fusion.
Heliyon. 2024 Aug 24;10(17):e36861. doi: 10.1016/j.heliyon.2024.e36861. eCollection 2024 Sep 15.
10. Spanish Emotion Recognition Method Based on Cross-Cultural Perspective.
Front Psychol. 2022 May 31;13:849083. doi: 10.3389/fpsyg.2022.849083. eCollection 2022.

Cited By

1. Research on performance variations of classifiers with the influence of pre-processing methods for Chinese short text classification.
PLoS One. 2023 Oct 12;18(10):e0292582. doi: 10.1371/journal.pone.0292582. eCollection 2023.

References

1. Attention enhanced capsule network for text classification by encoding syntactic dependency trees with graph convolutional neural network.
PeerJ Comput Sci. 2022 Jan 5;8:e831. doi: 10.7717/peerj-cs.831. eCollection 2022.
2. Short Text Paraphrase Identification Model Based on RDN-MESIM.
Comput Intell Neurosci. 2021 Sep 5;2021:6865287. doi: 10.1155/2021/6865287. eCollection 2021.