Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
Department of Computer Science, Technical University of Munich, Germany.
Neural Netw. 2019 Oct;118:247-261. doi: 10.1016/j.neunet.2019.06.014. Epub 2019 Jul 8.
Text classification has been attracting increasing attention with the growth of textual data created on the Internet. Deep neural networks have made great progress in domains where a large amount of labeled training data is available. However, providing sufficient data is time-consuming and labor-intensive, posing substantial obstacles to extending learned models to new domains or new tasks. In this paper, we investigate the transfer capability of capsule networks for text classification. Capsule networks are able to capture the intrinsic spatial part-whole relationships that constitute domain-invariant knowledge, bridging the knowledge gap between the source and target domains (or tasks). We propose an iterative adaptation strategy for cross-domain text classification, which iteratively adapts a model trained on the source domain to the target domain. A fast training method with capsule compression and class-guided routing is designed to make the capsule network computationally more efficient for cross-domain text classification. We first conduct experiments to evaluate the performance of the capsule network on six benchmark datasets for generic text classification. The capsule networks outperform the compared models on four of the six datasets, suggesting their effectiveness for text classification. More importantly, we demonstrate the transfer capability of the proposed cross-domain capsule network (TL-Capsule) by applying it to two transfer learning applications: single-label to multi-label text classification and cross-domain sentiment classification. The experimental results show that capsule networks consistently and substantially outperform the compared methods on both tasks. To the best of our knowledge, this is the first work that empirically investigates the transfer capability of capsule networks for text modeling.
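The part-whole mechanism the abstract refers to is realized through routing-by-agreement between capsule layers. The following is a minimal NumPy sketch of the generic dynamic routing procedure from Sabour et al. (2017), not the paper's TL-Capsule method (whose capsule compression and class-guided routing are not specified here); all shapes and names are hypothetical.

```python
# Generic routing-by-agreement between two capsule layers (illustrative
# sketch; not the paper's TL-Capsule implementation).
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    """Non-linearity that shrinks short vectors toward zero and long
    vectors toward (but below) unit length, preserving direction."""
    norm_sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Route prediction vectors u_hat of shape
    (num_in_capsules, num_out_capsules, out_dim) for num_iters rounds."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: softmax over output capsules per input.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = np.einsum("ij,ijd->jd", c, u_hat)      # weighted sum of votes
        v = squash(s)                              # output capsule vectors
        b = b + np.einsum("ijd,jd->ij", u_hat, v)  # agreement update
    return v

rng = np.random.default_rng(0)
u_hat = rng.standard_normal((6, 3, 4))  # 6 input capsules -> 3 outputs, dim 4
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 4)
```

Inputs whose predictions agree with an output capsule's current vector get their routing weight reinforced across iterations, which is how lower-level "parts" are assigned to higher-level "wholes".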