Eduardo Lobo Lustosa Cabral, Larissa Driemeier
Institute for Energy and Nuclear Research and Mauá Institute of Technology, São Paulo, SP, Brazil.
Department of Mechatronics and Mechanical Systems Engineering - Polytechnic School - University of São Paulo, São Paulo, SP, Brazil.
Neural Netw. 2025 Nov;191:107763. doi: 10.1016/j.neunet.2025.107763. Epub 2025 Jun 26.
The expanding scale of large neural network models introduces significant challenges, driving efforts to reduce memory usage and enhance computational efficiency. Such measures are crucial to ensure the practical implementation and effective application of these sophisticated models across a wide array of use cases. This study examines the impact of parameter bit precision on model performance compared to standard 32-bit models, with a focus on multiclass object classification in images. The models analyzed include those with fully connected layers, convolutional layers, and transformer blocks, with model weight resolution ranging from 1 bit to 4.08 bits. The findings indicate that models with lower parameter bit precision achieve results comparable to 32-bit models, showing promise for use in memory-constrained devices. While low-resolution models with a small number of parameters require more training epochs to achieve accuracy comparable to 32-bit models, those with a large number of parameters achieve similar performance within the same number of epochs. Additionally, data augmentation can destabilize training in low-resolution models, but including zero as a potential value in the weight parameters helps maintain stability and prevents performance degradation. Overall, 2.32-bit weights offer the optimal balance of memory reduction, performance, and efficiency. However, further research should explore other dataset types and more complex and larger models. These findings suggest a potential new era for optimized neural network models with reduced memory requirements and improved computational efficiency, though advancements in dedicated hardware are necessary to fully realize this potential.
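The fractional bit widths reported in the abstract correspond to the number of discrete weight levels: for example, 2.32 bits is log2(5) ≈ 2.32, i.e., five levels, which allows a symmetric set that includes zero. As a rough illustration only (the paper's actual quantization scheme, level values, and training procedure are not specified here), nearest-level weight quantization can be sketched as:

```python
import numpy as np

def quantize_weights(w, levels):
    """Map each weight to the nearest value in `levels` (nearest-neighbor quantization)."""
    levels = np.asarray(levels, dtype=np.asarray(w).dtype)
    # Broadcast |w - level| over all levels and pick the closest one per weight.
    idx = np.abs(np.asarray(w)[..., None] - levels).argmin(axis=-1)
    return levels[idx]

# Hypothetical symmetric 5-level set including zero -> log2(5) ≈ 2.32 bits per weight.
LEVELS_2_32_BIT = [-1.0, -0.5, 0.0, 0.5, 1.0]

rng = np.random.default_rng(0)
w = rng.normal(scale=0.5, size=(4, 4)).astype(np.float32)
wq = quantize_weights(w, LEVELS_2_32_BIT)  # every entry is one of the five levels
```

Including zero among the levels is what the abstract credits with stabilizing training under data augmentation; the level values above are illustrative, not taken from the paper.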