Kleyko Denis, Rosato Antonello, Frady Edward Paxon, Panella Massimo, Sommer Friedrich T
IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):9885-9899. doi: 10.1109/TNNLS.2023.3237381. Epub 2024 Jul 10.
Multilayer neural networks set the current state of the art for many technical classification problems. However, these networks remain essentially black boxes when it comes to analyzing them and predicting their performance. Here, we develop a statistical theory for the one-layer perceptron and show that it can predict the performance of a surprisingly large variety of neural networks with different architectures. A general theory of classification with perceptrons is developed by generalizing an existing theory for analyzing reservoir computing models and connectionist models for symbolic reasoning known as vector symbolic architectures. Our statistical theory offers three formulas that leverage the signal statistics at increasing levels of detail. The formulas are analytically intractable but can be evaluated numerically. The description level that captures the most detail requires stochastic sampling methods. Depending on the network model, the simpler formulas already yield high prediction accuracy. The quality of the theory's predictions is assessed in three experimental settings: a memorization task for echo state networks (ESNs) from the reservoir computing literature, a collection of classification datasets for shallow randomly connected networks, and the ImageNet dataset for deep convolutional neural networks. We find that the second description level of the perceptron theory can predict the performance of types of ESNs that could not be described previously. Furthermore, the theory can predict the performance of deep multilayer neural networks when applied to their output layer. Whereas other methods for predicting neural network performance commonly require training an estimator model, the proposed theory requires only the first two moments of the distribution of the postsynaptic sums in the output neurons. Moreover, the perceptron theory compares favorably to other methods that do not rely on training an estimator model.
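The abstract's central claim, that classification accuracy can be predicted from only the first two moments of the postsynaptic sums in the output neurons, can be illustrated with a minimal Monte Carlo sketch. This is an assumption-laden illustration, not the paper's actual formulas: it assumes the postsynaptic sums are Gaussian, and the parameterization `mu[i, j]` / `sigma[i, j]` (moments of output neuron `j` on inputs of class `i`) and the function name `predict_accuracy` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_accuracy(mu, sigma, n_samples=100_000):
    """Monte Carlo estimate of classification accuracy from the first two
    moments of the postsynaptic sums, under a Gaussian assumption.

    mu[i, j], sigma[i, j]: mean and standard deviation of output neuron j's
    postsynaptic sum when the input belongs to class i (hypothetical
    parameterization for illustration only).
    """
    n_classes = mu.shape[0]
    acc = 0.0
    for c in range(n_classes):
        # Sample postsynaptic sums of all output neurons for class-c inputs;
        # the input is classified correctly when neuron c has the largest sum.
        sums = rng.normal(mu[c], sigma[c], size=(n_samples, n_classes))
        acc += np.mean(sums.argmax(axis=1) == c)
    return acc / n_classes  # classes assumed equally likely

# Toy example: 3 classes, correct output neuron has a higher mean response.
mu = np.eye(3) * 2.0
sigma = np.ones((3, 3))
acc = predict_accuracy(mu, sigma)
```

With a mean separation of 2 and unit variance, the predicted accuracy lands well above the 1/3 chance level, mirroring how the theory's simpler description levels turn signal statistics into a performance estimate without training any estimator model.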