Department of Mathematics and Statistics, Concordia University, Montreal, Quebec, H3G 1M8, Canada
Department of Mathematics, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
Neural Comput. 2022 Jul 14;34(8):1756-1789. doi: 10.1162/neco_a_01510.
Often in language and other areas of cognition, whether or not two components of an object are identical determines whether it is well formed. We call such constraints identity effects. When developing a system to learn well-formedness from examples, it is easy enough to build in an identity effect. But can identity effects be learned from the data without explicit guidance? We provide a framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference. We then show that a broad class of learning algorithms, including deep feedforward neural networks trained via gradient-based algorithms (such as stochastic gradient descent or the Adam method), satisfies our criteria, depending on how inputs are encoded. In some broader circumstances, we are able to provide adversarial examples that the network necessarily classifies incorrectly. Finally, we demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs. This allows us to show effects similar to those predicted by theory, even for more realistic methods that violate some of the conditions of our theoretical results.
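To make the kind of experiment described above concrete, here is a minimal sketch (not the authors' code; all names, architecture choices, and hyperparameters are illustrative assumptions) of an identity-effect task with one-hot input encodings: pairs of symbols are labeled well formed exactly when the two symbols are identical, a small feedforward network is trained by plain gradient descent on pairs drawn from a subset of the symbols, and generalization is then probed on pairs built from held-out novel symbols.

```python
# Minimal sketch of an identity-effect learning experiment (illustrative only).
# Label = 1 iff the two symbols in a pair are identical ("well formed").
import numpy as np

rng = np.random.default_rng(0)
n_symbols = 10                       # symbols 0..9; the last two are held out
train_symbols = range(8)

def encode(a, b):
    """One-hot encode each symbol of a pair and concatenate the two codes."""
    v = np.zeros(2 * n_symbols)
    v[a] = 1.0
    v[n_symbols + b] = 1.0
    return v

# Training set: all pairs over the training symbols, labeled by identity.
X = np.array([encode(a, b) for a in train_symbols for b in train_symbols])
y = np.array([float(a == b) for a in train_symbols for b in train_symbols])

# One hidden layer with tanh units and a sigmoid output, trained by
# full-batch gradient descent on the cross-entropy loss.
d_in, d_h = 2 * n_symbols, 32
W1 = rng.normal(0.0, 0.5, (d_in, d_h)); b1 = np.zeros(d_h)
W2 = rng.normal(0.0, 0.5, d_h);         b2 = 0.0

def forward(X):
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, p

lr = 0.5
for step in range(2000):
    h, p = forward(X)
    g = (p - y) / len(y)             # gradient of mean cross-entropy wrt logits
    W2 -= lr * (h.T @ g)
    b2 -= lr * g.sum()
    gh = np.outer(g, W2) * (1 - h**2)
    W1 -= lr * (X.T @ gh)
    b1 -= lr * gh.sum(axis=0)

# Training pairs are classified essentially perfectly...
_, p_train = forward(X)
print("train accuracy:", np.mean((p_train > 0.5) == (y > 0.5)))

# ...but pairs built from the novel symbols 8 and 9 probe whether the
# identity relation itself was learned; with one-hot codes it typically
# is not, consistent with the theory summarized in the abstract.
for a, b in [(8, 8), (9, 9), (8, 9)]:
    _, p = forward(encode(a, b)[None, :])
    print(f"pair ({a},{b}) identical={a == b}: p(well formed) = {p[0]:.3f}")
```

Because each novel symbol's one-hot coordinate is untouched during training, nothing ties the network's behavior on those coordinates to the identity relation, which is the intuition the paper's rigorous argument makes precise; swapping in a different encoding (e.g., distributed codes) changes what can generalize.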