Tino P, Köteles M
Department of Computer Science and Engineering, Slovak Technical University, Ilkovicova 3, 812 19 Bratislava, Slovakia.
IEEE Trans Neural Netw. 1999;10(2):284-302. doi: 10.1109/72.750555.
While much work has been done in neural-based modeling of real-valued chaotic time series, little effort has been devoted to addressing similar problems in the symbolic domain. We investigate the knowledge induction process associated with training recurrent neural networks (RNNs) on single long chaotic symbolic sequences. Even though training RNNs to predict the next symbol leaves standard performance measures, such as the mean square error on the network output, virtually unchanged, the networks nevertheless extract a great deal of knowledge. We monitor the knowledge extraction process by considering the networks' stochastic sources and letting them generate sequences, which are then compared against the training sequence via information-theoretic entropy and cross-entropy measures. We also study the possibility of reformulating the knowledge gained by RNNs in the compact and easy-to-analyze form of finite-state stochastic machines. The experiments are performed on two sequences with different "complexities," measured by the size and state transition structure of the induced Crutchfield's epsilon-machines. We find that, with respect to the original RNNs, the extracted machines can achieve comparable or even better entropy and cross-entropy performance. Moreover, RNNs reflect the training sequence complexity in their dynamical state representations, which can in turn be reformulated using finite-state means. Our findings are confirmed by a much more detailed analysis of model-generated sequences through the statistical mechanical metaphor of entropy spectra. We also introduce a visual representation of the allowed block structure in the studied sequences that, besides having nice theoretical properties, offers, at the topological level, an illustrative insight into both the RNN training and finite-state stochastic machine extraction processes.
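The entropy and cross-entropy monitoring described in the abstract can be sketched with empirical n-gram (block) statistics. This is an illustrative implementation, not the authors' code; the function names and the additive-smoothing constant `alpha` are assumptions introduced here to keep the cross-entropy finite when the generated sequence misses blocks present in the training sequence.

```python
from collections import Counter
from math import log2

def block_counts(seq, n):
    """Count all length-n blocks (n-grams) in a symbolic sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def block_entropy(seq, n):
    """Empirical Shannon entropy (bits) of the length-n block distribution."""
    counts = block_counts(seq, n)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

def cross_entropy(train, generated, n, alpha=0.5):
    """Cross-entropy (bits) of the training sequence's block distribution
    against the model-generated one, with additive smoothing (alpha) so
    blocks unseen in the generated sequence do not yield infinities."""
    p = block_counts(train, n)
    q = block_counts(generated, n)
    blocks = set(p) | set(q)
    q_total = sum(q.values()) + alpha * len(blocks)
    p_total = sum(p.values())
    return -sum((c / p_total) * log2((q.get(b, 0) + alpha) / q_total)
                for b, c in p.items())
```

A model whose generated sequence matches the training sequence's block statistics drives the cross-entropy down toward the block entropy of the training sequence itself, which is the sense in which these measures track knowledge extraction during training.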
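The finite-state stochastic machines mentioned in the abstract generate symbolic sequences by a random walk over labeled probabilistic transitions. The toy two-state machine below is a hypothetical illustration of this generation mechanism; its topology is not one of the paper's induced epsilon-machines.

```python
import random

# Each state maps to (next_state, emitted_symbol, probability) triples
# whose probabilities sum to one. This toy topology forbids the block
# "11": state B, reached after emitting "1", can only emit "0".
MACHINE = {
    "A": [("A", "0", 0.7), ("B", "1", 0.3)],
    "B": [("A", "0", 1.0)],
}

def generate(machine, start, length, rng=random):
    """Generate a symbolic sequence by sampling transitions of a
    finite-state stochastic machine, starting from the given state."""
    state, out = start, []
    for _ in range(length):
        r, acc = rng.random(), 0.0
        for nxt, sym, prob in machine[state]:
            acc += prob
            if r <= acc:
                state = nxt
                out.append(sym)
                break
    return "".join(out)
```

Sequences drawn this way can be scored against the training sequence with the same entropy and cross-entropy measures, which is how the abstract's comparison between extracted machines and the original RNNs proceeds at the level of generated data.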