

Hyper-flexible Convolutional Neural Networks based on Generalized Lehmer and Power Means.

Affiliations

Faculty of Information Technology, University of Jyväskylä, Finland.

Department of Artificial Intelligence, Kharkiv National University of Radio Electronics, Ukraine.

Publication information

Neural Netw. 2022 Nov;155:177-203. doi: 10.1016/j.neunet.2022.08.017. Epub 2022 Aug 23.

Abstract

The Convolutional Neural Network is one of the best-known members of the deep learning family of neural network architectures and is used for many purposes, including image classification. Despite their wide adoption, such networks are known to be highly tuned to the training data (samples representing a particular problem) and poorly reusable for new problems. One way to change this is to make trainable not only the weights but also the parameters of the mathematical functions that simulate the various neural computations within such networks. In this way, we can distinguish between narrowly focused task-specific parameters (the weights) and more generic capability-specific parameters. In this paper, we propose two flexible mathematical functions with trainable parameters (the Generalized Lehmer Mean and the Generalized Power Mean) to replace fixed operations (such as the ordinary arithmetic mean or simple weighted aggregation) traditionally used within various components of a convolutional neural network architecture. We call the resulting architecture a hyper-flexible convolutional neural network. We provide mathematical justification for its components and show experimentally that it outperforms the traditional architecture, including better robustness against adversarial perturbations of the test data.
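For orientation, the two families of means named in the abstract have well-known classical forms: the Lehmer mean L_p(x) = (sum_i x_i^p) / (sum_i x_i^(p-1)) and the power mean M_p(x) = ((1/n) * sum_i x_i^p)^(1/p); the paper's "generalized" variants introduce additional trainable parameters that are not reproduced here. The sketch below is a minimal, hypothetical illustration of the general idea, not the authors' implementation: a 2D pooling layer whose aggregation is the classical Lehmer mean with a single trainable exponent p. The class name LehmerPool2d and the defaults (p_init, eps) are assumptions made for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LehmerPool2d(nn.Module):
    # Pools each k x k window with the classical Lehmer mean
    #   L_p(x) = sum_i x_i**p / sum_i x_i**(p - 1),
    # where p is a single trainable scalar shared across the layer.
    def __init__(self, kernel_size=2, stride=2, p_init=1.0, eps=1e-6):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.eps = eps
        self.p = nn.Parameter(torch.tensor(float(p_init)))  # trainable aggregation parameter

    def forward(self, x):
        # x: (N, C, H, W). Shift each feature map to be positive so that
        # fractional powers of the entries are well defined.
        n, c, h, w = x.shape
        x = x - x.amin(dim=(2, 3), keepdim=True) + self.eps
        # Extract pooling windows: (N, C*k*k, L) -> (N, C, k*k, L)
        windows = F.unfold(x, self.kernel_size, stride=self.stride)
        windows = windows.view(n, c, self.kernel_size ** 2, -1)
        num = (windows ** self.p).sum(dim=2)
        den = (windows ** (self.p - 1)).sum(dim=2)
        out = num / (den + self.eps)
        h_out = (h - self.kernel_size) // self.stride + 1
        w_out = (w - self.kernel_size) // self.stride + 1
        return out.view(n, c, h_out, w_out)

if __name__ == "__main__":
    pool = LehmerPool2d(kernel_size=2, stride=2)
    y = pool(torch.randn(8, 16, 32, 32))
    print(y.shape)  # torch.Size([8, 16, 16, 16])

With p learnable, p = 1 recovers ordinary average pooling and increasing p pushes the output toward the window maximum, so the layer can learn where on that continuum to operate; an analogous layer built on the power mean interpolates between the minimum, the arithmetic mean, and the maximum in the same way. This is the kind of continuum between fixed aggregation operations that the paper's trainable capability-specific parameters are meant to expose.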

