A Transformer-Based Hierarchical Variational AutoEncoder Combined Hidden Markov Model for Long Text Generation.

Author Information

Zhao Kun, Ding Hongwei, Ye Kai, Cui Xiaohui

Affiliation

Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China.

Publication Information

Entropy (Basel). 2021 Sep 29;23(10):1277. doi: 10.3390/e23101277.

Abstract

The Variational AutoEncoder (VAE) has made significant progress in text generation, but most work has focused on short texts (usually a single sentence). Long texts consist of multiple sentences, and there are particular relationships between those sentences, especially between the latent variables that control their generation. These relationships help in generating coherent and logically connected long texts, yet very few studies have examined them. We propose HT-HVAE, a method that combines a Transformer-based hierarchical Variational AutoEncoder with a Hidden Markov Model (HMM) to learn multiple hierarchical latent variables and the relationships among them, thereby improving long text generation. A hierarchical Transformer encoder encodes the long text so as to capture its hierarchical structure, and HT-HVAE's generation network uses an HMM to learn the relationships between latent variables. We also propose a method for calculating perplexity under this multiple hierarchical latent variable structure. Experimental results show that our model is more effective on datasets with strong logical structure, alleviates the notorious posterior-collapse problem, and generates more coherent and logically connected long text.
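
The paper's implementation is not reproduced here, but as a rough illustration of the HMM-over-latents idea the abstract describes, the following PyTorch-style sketch implements a first-order Gaussian transition prior p(z_t | z_{t-1}) over sentence-level latent variables, together with the corresponding KL term of the ELBO. All names (MarkovGaussianPrior, kl_markov) and architectural details (an MLP transition, diagonal Gaussians) are illustrative assumptions, not taken from HT-HVAE's actual code.

```python
# Hypothetical sketch (not the authors' released code): a first-order
# Markov (HMM-style) Gaussian prior over sentence-level latent variables
# z_1..z_T, of the kind the abstract attributes to HT-HVAE's generator.
import torch
import torch.nn as nn


class MarkovGaussianPrior(nn.Module):
    """p(z_t | z_{t-1}) = N(mu(z_{t-1}), diag(sigma(z_{t-1})^2)); p(z_1) = N(0, I)."""

    def __init__(self, latent_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, 2 * latent_dim),  # mean and log-variance
        )

    def forward(self, z_prev: torch.Tensor):
        mu, logvar = self.net(z_prev).chunk(2, dim=-1)
        return mu, logvar


def kl_markov(q_mu: torch.Tensor, q_logvar: torch.Tensor,
              prior: MarkovGaussianPrior) -> torch.Tensor:
    """Sum over sentences t of KL(q(z_t | x) || p(z_t | z_{t-1})),
    conditioning the prior on a reparameterized sample of z_{t-1}.

    q_mu, q_logvar: (batch, T, latent_dim) posterior parameters, one per
    sentence, e.g. produced by a hierarchical Transformer encoder.
    """
    batch, T, _ = q_mu.shape
    # Reparameterized posterior samples for conditioning the transitions.
    z = q_mu + torch.randn_like(q_mu) * (0.5 * q_logvar).exp()
    kl_total = q_mu.new_zeros(batch)
    for t in range(T):
        if t == 0:  # standard-normal prior on the first sentence latent
            p_mu = torch.zeros_like(q_mu[:, 0])
            p_logvar = torch.zeros_like(q_logvar[:, 0])
        else:       # learned transition conditioned on the previous latent
            p_mu, p_logvar = prior(z[:, t - 1])
        # Closed-form KL between two diagonal Gaussians.
        kl = 0.5 * (p_logvar - q_logvar[:, t]
                    + (q_logvar[:, t].exp() + (q_mu[:, t] - p_mu) ** 2)
                    / p_logvar.exp() - 1.0)
        kl_total = kl_total + kl.sum(dim=-1)
    return kl_total
```

In a full model, q_mu and q_logvar would come from the encoder's sentence-level posteriors and kl_total would be added to the reconstruction loss. The learned transition prior is what lets each sentence's latent variable depend on the previous one, which is the property the abstract credits for more coherent, logically connected long text.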

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d15d/8534582/96974f5deb54/entropy-23-01277-g001.jpg
