

An empirical study of LLaMA3 quantization: from LLMs to MLLMs.

Authors

Huang Wei, Zheng Xingyu, Ma Xudong, Qin Haotong, Lv Chengtao, Chen Hong, Luo Jie, Qi Xiaojuan, Liu Xianglong, Magno Michele

Affiliations

Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077 China.

School of Computer Science and Engineering, Beihang University, Xueyuan Road, Beijing, 100191 China.

Publication

Vis Intell. 2024;2(1):36. doi: 10.1007/s44267-024-00070-x. Epub 2024 Dec 30.

DOI: 10.1007/s44267-024-00070-x
PMID: 39807379
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11728678/
Abstract

The LLaMA family, a collection of foundation language models ranging from 7B to 65B parameters, has become one of the most powerful families of open-source large language models (LLMs) and a popular LLM backbone for multi-modal large language models (MLLMs), widely used in computer vision and natural language understanding tasks. In particular, the recently released LLaMA3 models have achieved impressive performance across domains through super-large-scale pre-training on over 15T tokens of data. Given the wide application of low-bit quantization for LLMs in resource-constrained scenarios, we explore LLaMA3's capabilities when quantized to low bit-widths. This exploration can provide new insights into and challenges for the low-bit quantization of LLaMA3 and future LLMs, especially in addressing the performance degradation that arises in LLM compression. Specifically, we comprehensively evaluate 10 existing post-training quantization and LoRA fine-tuning (LoRA-FT) methods on LLaMA3 at 1-8 bits and on various datasets to reveal LLaMA3's low-bit quantization performance. To uncover the capabilities of low-bit quantized MLLMs, we assessed the performance of the LLaMA3-based LLaVA-Next-8B model at ultra-low bit-widths (2-4 bits) with post-training quantization methods. Our experimental results indicate that LLaMA3 still suffers non-negligible degradation in both linguistic and visual contexts, particularly at ultra-low bit-widths. This highlights a significant performance gap at low bit-widths that needs to be addressed in future developments. We expect this empirical study to prove valuable in advancing future models, driving LLMs and MLLMs to achieve higher accuracy at lower bit-widths and thereby greater practicality.
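To make concrete what "low-bit quantization" of weights means, here is a minimal sketch of symmetric round-to-nearest (RTN) post-training quantization, the simplest baseline among the kinds of methods the paper evaluates. The function name, per-row scaling choice, and array shapes are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def quantize_rtn(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric round-to-nearest (RTN) fake-quantization of a weight matrix.

    Each row gets its own scale so that its largest-magnitude weight
    maps onto the edge of the signed integer grid.
    """
    qmax = 2 ** (bits - 1) - 1               # e.g. 7 for 4-bit signed ints
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0                  # guard all-zero rows
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                         # dequantized weights

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
for b in (8, 4, 2):
    err = np.abs(quantize_rtn(w, b) - w).mean()
    print(f"{b}-bit mean abs error: {err:.4f}")
```

The reconstruction error grows sharply as the bit-width shrinks, which is the basic mechanism behind the degradation the study measures at ultra-low bit-widths; the methods the paper benchmarks (GPTQ, AWQ, and others) refine this naive rounding to reduce that error.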


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64da/11728678/3401fde40a64/44267_2024_70_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64da/11728678/ad8853246a4b/44267_2024_70_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64da/11728678/a3e97f50b70b/44267_2024_70_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64da/11728678/84896bd4764b/44267_2024_70_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64da/11728678/f63ac55d6284/44267_2024_70_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64da/11728678/ef71eb01ce74/44267_2024_70_Fig6_HTML.jpg

Similar Articles

1. An empirical study of LLaMA3 quantization: from LLMs to MLLMs. Vis Intell. 2024;2(1):36. doi: 10.1007/s44267-024-00070-x. Epub 2024 Dec 30.
2. Enhancing semantical text understanding with fine-tuned large language models: A case study on Quora Question Pair duplicate identification. PLoS One. 2025 Jan 10;20(1):e0317042. doi: 10.1371/journal.pone.0317042. eCollection 2025.
3. PH-LLM: Public Health Large Language Models for Infoveillance. medRxiv. 2025 Feb 10:2025.02.08.25321587. doi: 10.1101/2025.02.08.25321587.
4. Large language model to multimodal large language model: A journey to shape the biological macromolecules to biological sciences and medicine. Mol Ther Nucleic Acids. 2024 Jun 15;35(3):102255. doi: 10.1016/j.omtn.2024.102255. eCollection 2024 Sep 10.
5. CACTUS: Chemistry Agent Connecting Tool Usage to Science. ACS Omega. 2024 Oct 25;9(46):46563-46573. doi: 10.1021/acsomega.4c08408. eCollection 2024 Nov 19.
6. Performance of large language models for CAD-RADS 2.0 classification derived from cardiac CT reports. J Cardiovasc Comput Tomogr. 2025 May-Jun;19(3):322-330. doi: 10.1016/j.jcct.2025.03.007. Epub 2025 Apr 9.
7. Advancing entity recognition in biomedicine via instruction tuning of large language models. Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae163.
8. Comprehensive testing of large language models for extraction of structured data in pathology. Commun Med (Lond). 2025 Mar 31;5(1):96. doi: 10.1038/s43856-025-00808-8.
9. Privacy-ensuring Open-weights Large Language Models Are Competitive with Closed-weights GPT-4o in Extracting Chest Radiography Findings from Free-Text Reports. Radiology. 2025 Jan;314(1):e240895. doi: 10.1148/radiol.240895.
10. Benchmarking of Large Language Models for the Dental Admission Test. Health Data Sci. 2025 Apr 1;5:0250. doi: 10.34133/hds.0250. eCollection 2025.