Jahromi, Saeed S.; Orús, Román
Department of Physics, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran.
Donostia International Physics Center, Paseo Manuel de Lardizabal 4, 20018, San Sebastián, Spain.
Sci Rep. 2024 Aug 16;14(1):19017. doi: 10.1038/s41598-024-69366-8.
Deep neural networks (NNs) encounter scalability limitations when confronted with a vast number of neurons, thereby constraining their achievable network depth. To address this challenge, we propose integrating tensor networks (TN) into NN frameworks, combined with a variational, DMRG-inspired training technique. This, in turn, results in a scalable tensor neural network (TNN) architecture that can be trained efficiently over a large parameter space. Our variational algorithm relies on a local gradient-descent technique and allows manual or automatic computation of tensor gradients, facilitating the design of hybrid TNN models that combine dense and tensor layers. Our training algorithm further provides insight into the entanglement structure of the tensorized trainable weights and the correlations among the model parameters. We validate the accuracy and efficiency of our method by designing TNN models and providing benchmark results for linear and non-linear regression, data classification, and image recognition on MNIST handwritten digits.
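For concreteness, the following is a minimal NumPy sketch of the general idea described in the abstract, not the authors' implementation: a dense weight matrix is factorized into matrix-product-operator (MPO) cores, and training proceeds by DMRG-style sweeps that apply a local gradient-descent update to one core at a time. The core shapes, the bond dimension chi, the learning rate, and the toy regression task are all illustrative assumptions.

    # Minimal sketch: an MPO-factorized weight layer trained by
    # DMRG-style sweeps with local gradient descent. All dimensions
    # and the toy task are assumptions for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)

    # Two MPO cores factorize a 4x4 weight matrix:
    #   W[(i1,i2),(o1,o2)] = sum_a A1[i1,o1,a] * A2[a,i2,o2]
    chi = 2                                        # assumed bond dimension
    A1 = rng.normal(scale=0.5, size=(2, 2, chi))   # (in1, out1, bond)
    A2 = rng.normal(scale=0.5, size=(chi, 2, 2))   # (bond, in2, out2)

    def contract_weight(A1, A2):
        """Contract the two cores into the full 4x4 weight matrix."""
        W = np.einsum('ipa,ajq->ijpq', A1, A2)     # axes (i1,i2,o1,o2)
        return W.reshape(4, 4)

    # Toy linear-regression data whose target map is itself an MPO,
    # so the model can represent it exactly.
    B1 = rng.normal(size=(2, 2, chi))
    B2 = rng.normal(size=(chi, 2, 2))
    W_true = contract_weight(B1, B2)
    X = rng.normal(size=(64, 4))
    T = X @ W_true

    lr = 0.05
    for sweep in range(500):
        # --- local update of core A1 (A2 held fixed) ---------------
        W = contract_weight(A1, A2)
        G = 2.0 * X.T @ (X @ W - T) / len(X)       # dL/dW for MSE loss
        Gt = G.reshape(2, 2, 2, 2)                 # back to (i1,i2,o1,o2)
        # Chain rule: contract dL/dW with the fixed core A2,
        # leaving the indices of A1 open.
        dA1 = np.einsum('ijpq,ajq->ipa', Gt, A2)
        A1 -= lr * dA1
        # --- local update of core A2 (A1 held fixed) ---------------
        W = contract_weight(A1, A2)
        G = 2.0 * X.T @ (X @ W - T) / len(X)
        Gt = G.reshape(2, 2, 2, 2)
        dA2 = np.einsum('ijpq,ipa->ajq', Gt, A1)
        A2 -= lr * dA2

    mse = np.mean((X @ contract_weight(A1, A2) - T) ** 2)
    print(f"final MSE after sweeping: {mse:.4f}")

Sweeping core by core mirrors a DMRG sweep: each update is a small local problem, so the full weight matrix never has to be stored or differentiated in one piece, and the manual gradients shown here could equally be supplied by automatic differentiation, as the abstract notes.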