
Enhancing Text Generation via Parse Tree Embedding.

Affiliations

School of International Economics and Management, Beijing Technology and Business University, Beijing 100048, China.

School of Computer Science and Engineering, Beijing Technology and Business University, Beijing 100048, China.

Publication Information

Comput Intell Neurosci. 2022 Jun 10;2022:4096383. doi: 10.1155/2022/4096383. eCollection 2022.

Abstract

Natural language generation (NLG) is a core component of machine translation, dialogue systems, speech recognition, summarization, and so forth. Existing text generation methods tend to be based on recurrent neural language models (NLMs), which generate sentences from an encoding vector. However, most of these models lack an explicit structured representation for text generation. In this work, we introduce a new generative model for NLG, called Tree-VAE. It first samples a sentence from the training corpus and then generates a new sentence conditioned on the corresponding parse tree embedding vector. A Tree-LSTM is used together with the Stanford Parser to extract sentence-structure information, which is then used to train a conditional variational autoencoder generator on the embeddings of sentence patterns. The proposed model is extensively evaluated on three different datasets. The experimental results show that the proposed model generates substantially more diverse and coherent text than existing baseline methods.
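The abstract states that a Tree-LSTM encodes the parse tree produced by the Stanford Parser into an embedding vector. The paper's exact architecture and dimensions are not given on this page, so the following is only an illustrative sketch of the standard Child-Sum Tree-LSTM cell (Tai et al., 2015), a common choice for embedding parse trees; all names and sizes here are assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ChildSumTreeLSTMCell:
    """Child-Sum Tree-LSTM cell: composes a node's input vector with the
    summed hidden states of its children, with one forget gate per child.
    Illustrative only; not the paper's actual encoder."""

    def __init__(self, in_dim, mem_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(mem_dim)
        # One (W, U, b) triple per gate: input, output, update, forget.
        self.W = {g: rng.uniform(-scale, scale, (mem_dim, in_dim)) for g in "iouf"}
        self.U = {g: rng.uniform(-scale, scale, (mem_dim, mem_dim)) for g in "iouf"}
        self.b = {g: np.zeros(mem_dim) for g in "iouf"}

    def forward(self, x, child_states):
        """x: node input vector; child_states: list of (h, c) from children."""
        h_children = [h for h, _ in child_states]
        h_sum = (np.sum(h_children, axis=0) if h_children
                 else np.zeros_like(self.b["i"]))
        i = sigmoid(self.W["i"] @ x + self.U["i"] @ h_sum + self.b["i"])
        o = sigmoid(self.W["o"] @ x + self.U["o"] @ h_sum + self.b["o"])
        u = np.tanh(self.W["u"] @ x + self.U["u"] @ h_sum + self.b["u"])
        # Each child's memory cell is gated by its own forget gate.
        c = i * u
        for h_k, c_k in child_states:
            f_k = sigmoid(self.W["f"] @ x + self.U["f"] @ h_k + self.b["f"])
            c = c + f_k * c_k
        h = o * np.tanh(c)
        return h, c
```

Applying the cell bottom-up over the parse tree and taking the root's hidden state `h` yields a fixed-size tree embedding, which Tree-VAE would then use to condition sentence generation.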


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5202/9205699/3ecb9d22606c/CIN2022-4096383.001.jpg
