WI-TMLEGA: Weight Initialization and Training Method Based on Entropy Gain and Learning Rate Adjustment.

Authors

Tang Hongchuan, Li Zhongguo, Wang Qi, Fan Wenbin

Affiliations

School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang 212100, China.

School of Automotive Engineering, Nantong Institute of Technology, Nantong 226001, China.

Publication

Entropy (Basel). 2024 Jul 23;26(8):614. doi: 10.3390/e26080614.

Abstract

Addressing the issues of prolonged training times and low recognition rates in large model applications, this paper proposes a weight training method based on entropy gain for weight initialization and dynamic adjustment of the learning rate using the multilayer perceptron (MLP) model as an example. Initially, entropy gain was used to replace random initial values for weight initialization. Subsequently, an incremental learning rate strategy was employed for weight updates. The model was trained and validated using the MNIST handwritten digit dataset. The experimental results showed that, compared to random initialization, the proposed initialization method improves training effectiveness by 39.8% and increases the maximum recognition accuracy by 8.9%, demonstrating the feasibility of this method in large model applications.
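The abstract describes the two ingredients of WI-TMLEGA only at a high level, so the sketch below is a minimal, illustrative interpretation rather than the paper's actual procedure: it treats "entropy gain" as the information gain of each binarized input pixel with respect to the class labels, scales the MLP's first-layer initial weights by that gain instead of using purely random values, and pairs this with a linearly increasing ("incremental") learning-rate schedule. The helper names (entropy_gain, incremental_lr), the toy data, and all constants are assumptions made for illustration.

# Hedged sketch (not the paper's exact method): entropy-gain-scaled
# initialization plus an increasing learning-rate schedule.
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_gain(X_bin, y, n_classes):
    """Information gain H(Y) - H(Y | X_j) for each binarized feature j."""
    n, d = X_bin.shape
    h_y = entropy(np.bincount(y, minlength=n_classes) / n)
    gains = np.empty(d)
    for j in range(d):
        h_cond = 0.0
        for v in (0, 1):
            mask = X_bin[:, j] == v
            if mask.any():
                p_v = mask.mean()
                p_y_given_v = np.bincount(y[mask], minlength=n_classes) / mask.sum()
                h_cond += p_v * entropy(p_y_given_v)
        gains[j] = h_y - h_cond
    return gains

# Toy stand-in with MNIST-like shapes (784 inputs, 10 classes); real use
# would load the actual MNIST images and labels here.
rng = np.random.default_rng(0)
X = rng.random((1000, 784))
y = rng.integers(0, 10, size=1000)

gain = entropy_gain((X > 0.5).astype(int), y, n_classes=10)
scale = gain / (gain.max() + 1e-12)  # normalize per-feature gain to [0, 1]

# First-layer weights: random values modulated by per-feature entropy gain
# instead of purely random initialization (0.05 and 0.1 are arbitrary).
hidden = 128
W1 = rng.standard_normal((784, hidden)) * 0.05 * (scale[:, None] + 0.1)

def incremental_lr(step, total_steps, lr0=1e-3, lr_max=1e-2):
    """Learning rate that grows linearly from lr0 to lr_max over training."""
    return lr0 + (lr_max - lr0) * step / max(total_steps - 1, 1)

In actual use, the toy arrays would be replaced by the MNIST images and labels, and incremental_lr(step, total_steps) would supply the step size for each weight update during gradient descent.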

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce7f/11353430/19e66eaad46d/entropy-26-00614-g001.jpg

Similar Articles

High-order and multilayer perceptron initialization.
IEEE Trans Neural Netw. 1997;8(2):349-59. doi: 10.1109/72.557673.

Feedforward neural networks initialization based on discriminant learning.
Neural Netw. 2022 Feb;146:220-229. doi: 10.1016/j.neunet.2021.11.020. Epub 2021 Nov 25.
