Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.
Hearne Institute for Theoretical Physics and Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana 70803, USA.
Phys Rev Lett. 2022 May 6;128(18):180505. doi: 10.1103/PhysRevLett.128.180505.
Several architectures have been proposed for quantum neural networks (QNNs), with the goal of efficiently performing machine learning tasks on quantum data. Rigorous scaling results are urgently needed for specific QNN constructions to understand which, if any, will be trainable at a large scale. Here, we analyze the gradient scaling (and hence the trainability) for a recently proposed architecture that we call dissipative QNNs (DQNNs), where the input qubits of each layer are discarded at the layer's output. We find that DQNNs can exhibit barren plateaus, i.e., gradients that vanish exponentially in the number of qubits. Moreover, we provide quantitative bounds on the scaling of the gradient for DQNNs under different conditions, such as different cost functions and circuit depths, and show that trainability is not always guaranteed. Our work represents the first rigorous analysis of the scalability of a perceptron-based QNN.
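To make the notion of a barren plateau concrete, below is a minimal, self-contained sketch of the standard numerical diagnostic: estimate the variance of a single partial derivative of the cost over random parameter initializations and watch it shrink as qubits are added. This uses a generic hardware-efficient layered ansatz and a single-qubit cost, not the paper's dissipative perceptron-based DQNN construction; all circuit and cost choices here are illustrative assumptions.

```python
# Barren-plateau diagnostic sketch (assumed generic ansatz, not the paper's DQNN):
# Var[dC/dtheta] over random initializations vs. number of qubits.
import numpy as np

I = np.eye(2)
Z = np.diag([1.0, -1.0])

def ry(theta):
    """Single-qubit RY rotation, exp(-i*theta*Y/2), as a real 2x2 matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def layer_unitary(thetas, n_qubits):
    """One layer: an RY rotation on every qubit, then a line of CZ entanglers."""
    U = np.array([[1.0]])
    for t in thetas:
        U = np.kron(U, ry(t))
    dim = 2 ** n_qubits
    cz_diag = np.ones(dim)
    for q in range(n_qubits - 1):  # CZ between neighbours (q, q+1); qubit 0 is MSB
        for basis in range(dim):
            if (basis >> (n_qubits - 1 - q)) & 1 and (basis >> (n_qubits - 2 - q)) & 1:
                cz_diag[basis] *= -1.0
    return cz_diag[:, None] * U  # diagonal CZ layer applied after the rotations

def cost(params, n_qubits, n_layers):
    """Local cost C = <psi| Z_0 |psi> for the layered ansatz acting on |0...0>."""
    state = np.zeros(2 ** n_qubits)
    state[0] = 1.0
    for layer in range(n_layers):
        state = layer_unitary(params[layer], n_qubits) @ state
    obs = Z
    for _ in range(n_qubits - 1):
        obs = np.kron(obs, I)
    return float(np.real(np.vdot(state, obs @ state)))

def grad_first_param(params, n_qubits, n_layers):
    """Parameter-shift rule for the first rotation angle of the first layer."""
    shift = np.pi / 2
    p_plus, p_minus = params.copy(), params.copy()
    p_plus[0, 0] += shift
    p_minus[0, 0] -= shift
    return 0.5 * (cost(p_plus, n_qubits, n_layers) - cost(p_minus, n_qubits, n_layers))

rng = np.random.default_rng(0)
n_layers, n_samples = 4, 200
for n_qubits in range(2, 8):
    grads = [grad_first_param(rng.uniform(0, 2 * np.pi, (n_layers, n_qubits)),
                              n_qubits, n_layers)
             for _ in range(n_samples)]
    print(f"{n_qubits} qubits: Var[dC/dtheta] ~ {np.var(grads):.2e}")
```

In a barren-plateau regime this variance decays exponentially with the number of qubits, which is the gradient-scaling behavior the paper bounds rigorously for DQNNs under different cost functions and circuit depths.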