Laboratory for Information & Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02142.
Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02142.
Proc Natl Acad Sci U S A. 2023 Apr 4;120(14):e2208779120. doi: 10.1073/pnas.2208779120. Epub 2023 Mar 30.
While neural networks are used for classification tasks across domains, a long-standing open problem in machine learning is determining whether neural networks trained using standard procedures are consistent for classification, i.e., whether such models minimize the probability of misclassification for arbitrary data distributions. In this work, we identify and construct an explicit set of neural network classifiers that are consistent. Since effective neural networks in practice are typically both wide and deep, we analyze infinitely wide networks that are also infinitely deep. In particular, using the recent connection between infinitely wide neural networks and neural tangent kernels, we provide explicit activation functions that can be used to construct networks that achieve consistency. Interestingly, these activation functions are simple and easy to implement, yet differ from commonly used activations such as ReLU or sigmoid. More generally, we create a taxonomy of infinitely wide and deep networks and show that these models implement one of three well-known classifiers depending on the activation function used: 1) 1-nearest neighbor (model predictions are given by the label of the nearest training example); 2) majority vote (model predictions are given by the label of the class with the greatest representation in the training set); or 3) singular kernel classifiers (a set of classifiers containing those that achieve consistency). Our results highlight the benefit of using deep networks for classification tasks, in contrast to regression tasks, where excessive depth is harmful.
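The three limiting classifiers named in the abstract can be illustrated on toy data. The sketch below is a minimal illustration, not the paper's construction: the data, the kernel exponent `alpha`, and the function names are assumptions made for this example, and the singular kernel K(x, x') = ||x - x'||^{-alpha} stands in for the general class of singular kernel classifiers discussed in the paper.

```python
import numpy as np

# Toy training set: two Gaussian blobs labeled -1 and +1 (assumed data, not from the paper).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(-1.0, 1.0, (50, 2)), rng.normal(1.0, 1.0, (50, 2))])
y_train = np.concatenate([-np.ones(50), np.ones(50)])


def one_nearest_neighbor(X_train, y_train, x):
    """Predict the label of the closest training point (1-nearest neighbor)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]


def majority_vote(y_train, x):
    """Predict the most common label in the training set, ignoring the input x."""
    values, counts = np.unique(y_train, return_counts=True)
    return values[np.argmax(counts)]


def singular_kernel_classifier(X_train, y_train, x, alpha=1.0):
    """Weighted vote with a kernel K(x, x') = ||x - x'||^{-alpha} that diverges
    as x approaches a training point; alpha is an illustrative value chosen here,
    not a constant taken from the paper."""
    dists = np.maximum(np.linalg.norm(X_train - x, axis=1), 1e-12)  # avoid division by zero
    weights = dists ** (-alpha)
    return np.sign(weights @ y_train)


x_test = np.array([0.3, -0.2])
print(one_nearest_neighbor(X_train, y_train, x_test))
print(majority_vote(y_train, x_test))
print(singular_kernel_classifier(X_train, y_train, x_test))
```

The first two predictors depend only on the nearest training point or on class frequencies, while the singular kernel classifier interpolates between them by letting nearby training points dominate the vote, which is the behavior the paper associates with consistency.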