WI-TMLEGA: Weight Initialization and Training Method Based on Entropy Gain and Learning Rate Adjustment.

Authors

Tang Hongchuan, Li Zhongguo, Wang Qi, Fan Wenbin

Affiliations

School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang 212100, China.

School of Automotive Engineering, Nantong Institute of Technology, Nantong 226001, China.

Publication

Entropy (Basel). 2024 Jul 23;26(8):614. doi: 10.3390/e26080614.

Abstract

Addressing the issues of prolonged training times and low recognition rates in large model applications, this paper proposes a weight training method based on entropy gain for weight initialization and dynamic adjustment of the learning rate using the multilayer perceptron (MLP) model as an example. Initially, entropy gain was used to replace random initial values for weight initialization. Subsequently, an incremental learning rate strategy was employed for weight updates. The model was trained and validated using the MNIST handwritten digit dataset. The experimental results showed that, compared to random initialization, the proposed initialization method improves training effectiveness by 39.8% and increases the maximum recognition accuracy by 8.9%, demonstrating the feasibility of this method in large model applications.
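The abstract describes the two ingredients of WI-TMLEGA only at a high level, so the sketch below is a minimal, illustrative interpretation rather than the paper's actual procedure: it treats "entropy gain" as the information gain of each binarized input pixel with respect to the class labels, scales the MLP's first-layer initial weights by that gain instead of using purely random values, and pairs this with a linearly increasing ("incremental") learning-rate schedule. The helper names (entropy_gain, incremental_lr), the toy data, and all constants are assumptions made for illustration.

# Hedged sketch (not the paper's exact method): entropy-gain-scaled
# initialization plus an increasing learning-rate schedule.
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_gain(X_bin, y, n_classes):
    """Information gain H(Y) - H(Y | X_j) for each binarized feature j."""
    n, d = X_bin.shape
    h_y = entropy(np.bincount(y, minlength=n_classes) / n)
    gains = np.empty(d)
    for j in range(d):
        h_cond = 0.0
        for v in (0, 1):
            mask = X_bin[:, j] == v
            if mask.any():
                p_v = mask.mean()
                p_y_given_v = np.bincount(y[mask], minlength=n_classes) / mask.sum()
                h_cond += p_v * entropy(p_y_given_v)
        gains[j] = h_y - h_cond
    return gains

# Toy stand-in with MNIST-like shapes (784 inputs, 10 classes); real use
# would load the actual MNIST images and labels here.
rng = np.random.default_rng(0)
X = rng.random((1000, 784))
y = rng.integers(0, 10, size=1000)

gain = entropy_gain((X > 0.5).astype(int), y, n_classes=10)
scale = gain / (gain.max() + 1e-12)  # normalize per-feature gain to [0, 1]

# First-layer weights: random values modulated by per-feature entropy gain
# instead of purely random initialization (0.05 and 0.1 are arbitrary).
hidden = 128
W1 = rng.standard_normal((784, hidden)) * 0.05 * (scale[:, None] + 0.1)

def incremental_lr(step, total_steps, lr0=1e-3, lr_max=1e-2):
    """Learning rate that grows linearly from lr0 to lr_max over training."""
    return lr0 + (lr_max - lr0) * step / max(total_steps - 1, 1)

In actual use, the toy arrays would be replaced by the MNIST images and labels, and incremental_lr(step, total_steps) would supply the step size for each weight update during gradient descent.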

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce7f/11353430/19e66eaad46d/entropy-26-00614-g001.jpg

Similar Articles

High-order and multilayer perceptron initialization.
IEEE Trans Neural Netw. 1997;8(2):349-59. doi: 10.1109/72.557673.

Feedforward neural networks initialization based on discriminant learning.
Neural Netw. 2022 Feb;146:220-229. doi: 10.1016/j.neunet.2021.11.020. Epub 2021 Nov 25.
