Leveraging Symbolic Knowledge Bases for Commonsense Natural Language Inference Using Pattern Theory.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):13185-13202. doi: 10.1109/TPAMI.2023.3287837. Epub 2023 Oct 3.

Abstract

The commonsense natural language inference (CNLI) tasks aim to select the most likely follow-up statement to a contextual description of ordinary, everyday events and facts. Current approaches to transfer learning of CNLI models across tasks require a large amount of labeled data from the new task. This paper presents a way to reduce this need for additional annotated training data from the new task by leveraging symbolic knowledge bases, such as ConceptNet. We formulate a teacher-student framework for mixed symbolic-neural reasoning, with the large-scale symbolic knowledge base serving as the teacher and a trained CNLI model as the student. This hybrid distillation process involves two steps. The first step is a symbolic reasoning process. Given a collection of unlabeled data, we use an abductive reasoning framework based on Grenander's pattern theory to create weakly labeled data. Pattern theory is an energy-based graphical probabilistic framework for reasoning among random variables with varying dependency structures. In the second step, the weakly labeled data, along with a fraction of the labeled data, is used to transfer-learn the CNLI model into the new task. The goal is to reduce the fraction of labeled data required. We demonstrate the efficacy of our approach by using three publicly available datasets (OpenBookQA, SWAG, and HellaSWAG) and evaluating three CNLI models (BERT, LSTM, and ESIM) that represent different tasks. We show that, on average, we achieve 63% of the top performance of a fully supervised BERT model with no labeled data. With only 1,000 labeled samples, we can improve this performance to 72%. Interestingly, without training, the teacher mechanism itself has significant inference power. The pattern theory framework achieves 32.7% accuracy on OpenBookQA, outperforming transformer-based models such as GPT (26.6%), GPT-2 (30.2%), and BERT (27.1%) by a significant margin. We demonstrate that the framework can be generalized to successfully train neural CNLI models using knowledge distillation under unsupervised and semi-supervised learning settings. Our results show that it outperforms all unsupervised and weakly supervised baselines and some early supervised approaches, while offering competitive performance with fully supervised baselines. Additionally, we show that the abductive learning framework can be adapted for other downstream tasks, such as unsupervised semantic textual similarity, unsupervised sentiment classification, and zero-shot text classification, without significant modification to the framework. Finally, user studies show that the generated interpretations enhance its explainability by providing key insights into its reasoning mechanism.
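The abstract describes a two-step teacher-student pipeline: a symbolic teacher scores each candidate statement with an energy-based, pattern-theory-style measure over knowledge-base connections and turns the lowest-energy choice into a weak label, and a neural student is then fine-tuned on those weak labels plus a small set of gold labels. The Python sketch below illustrates that flow under loose assumptions; the concept extraction, the ConceptNet edge lookup, and the student's fit interface are hypothetical placeholders, not the authors' implementation.

    # Illustrative sketch of the two-step teacher-student pipeline from the
    # abstract. All names here (extract_concepts, conceptnet_relatedness,
    # student.fit) are hypothetical placeholders, not the paper's actual code.
    from itertools import product

    def extract_concepts(text):
        # Hypothetical concept extractor: content words become concepts.
        return {w.lower().strip(".,") for w in text.split() if len(w) > 3}

    def conceptnet_relatedness(c1, c2, kb):
        # Hypothetical symmetric edge-weight lookup in a symbolic knowledge
        # base such as ConceptNet; 0.0 means no known relation.
        return kb.get((c1, c2), kb.get((c2, c1), 0.0))

    def energy(context, candidate, kb):
        # Pattern-theory-style energy of the configuration linking context
        # and candidate concepts: the negative sum of bond compatibilities,
        # so better-supported candidates receive lower energy.
        ctx, cand = extract_concepts(context), extract_concepts(candidate)
        return -sum(conceptnet_relatedness(a, b, kb)
                    for a, b in product(ctx, cand))

    def teacher_weak_label(context, candidates, kb):
        # Step 1 (symbolic teacher): abductively pick the candidate whose
        # concept configuration has the lowest energy, yielding a weak label.
        return min(range(len(candidates)),
                   key=lambda i: energy(context, candidates[i], kb))

    def distill(student, unlabeled, labeled, kb):
        # Step 2 (neural student): fine-tune a CNLI model on teacher weak
        # labels plus a small fraction of gold-labeled examples.
        weak = [(ctx, cands, teacher_weak_label(ctx, cands, kb))
                for ctx, cands in unlabeled]
        student.fit(weak + labeled)  # hypothetical training interface
        return student

Making the teacher's decision an argmin over an explicit energy keeps each weak label traceable to the knowledge-base bonds that produced it, which is the property the abstract credits for both the teacher's standalone accuracy and the interpretability reported in the user studies.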
