
A survey of low-bit large language models: Basics, systems, and algorithms.

Authors

Gong Ruihao, Ding Yifu, Wang Zining, Lv Chengtao, Zheng Xingyu, Du Jinyang, Yong Yang, Gu Shiqiao, Qin Haotong, Guo Jinyang, Lin Dahua, Magno Michele, Liu Xianglong

Affiliations

Beihang University, 37 Xueyuan Road, Haidian District, 100191, Beijing, China.

ETH Zurich, Rämistrasse 101, 8092 Zurich, Switzerland.

Publication

Neural Netw. 2025 Jul 10;192:107856. doi: 10.1016/j.neunet.2025.107856.

DOI: 10.1016/j.neunet.2025.107856
PMID: 40782663
Abstract

Large language models (LLMs) have achieved remarkable advancements in natural language processing, showcasing exceptional performance across various tasks. However, the expensive memory and computational requirements present significant challenges for their practical deployment. Low-bit quantization has emerged as a critical approach to mitigate these challenges by reducing the bit-width of model parameters, activations, and gradients, thus decreasing memory usage and computational demands. This paper presents a comprehensive survey of low-bit quantization methods tailored for LLMs, covering the fundamental principles, system implementations, and algorithmic strategies. An overview of basic concepts and new data formats specific to low-bit LLMs is first introduced, followed by a review of frameworks and systems that facilitate low-bit LLMs across various hardware platforms. Then, we categorize and analyze techniques and toolkits for efficient low-bit training and inference of LLMs. Finally, we conclude with a discussion of future trends and potential advancements of low-bit LLMs. Our systematic overview from basic, system, and algorithm perspectives can offer valuable insights and guidelines for future works to enhance the efficiency and applicability of LLMs through low-bit quantization.
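To make the core idea concrete, here is a minimal sketch (not from the paper) of symmetric uniform quantization, the basic building block the abstract alludes to: floating-point weights are mapped to signed n-bit integers through a single per-tensor scale factor, shrinking memory while bounding the round-trip error by half a quantization step. The function names and the per-tensor granularity are illustrative assumptions; real low-bit LLM methods typically use per-channel or per-group scales and more elaborate calibration.

```python
import numpy as np

def quantize_symmetric(w: np.ndarray, n_bits: int = 4):
    """Map float weights to signed n-bit integers with one per-tensor scale."""
    qmax = 2 ** (n_bits - 1) - 1            # e.g. 7 for INT4
    scale = np.max(np.abs(w)) / qmax        # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from integers and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_symmetric(w, n_bits=4)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))
# Round-to-nearest keeps the error within half a quantization step.
assert max_err <= scale / 2 + 1e-6
```

At 4 bits each weight occupies an eighth of its FP32 storage (plus one scale per tensor), which is the memory saving the survey's system and algorithm sections build on.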


Similar Articles

1
PT-BitNet: Scaling up the 1-Bit large language model with post-training quantization.
Neural Netw. 2025 Nov;191:107855. doi: 10.1016/j.neunet.2025.107855. Epub 2025 Jul 9.
2
Short-Term Memory Impairment
3
Algorithmic Classification of Psychiatric Disorder-Related Spontaneous Communication Using Large Language Model Embeddings: Algorithm Development and Validation.
JMIR AI. 2025 May 30;4:e67369. doi: 10.2196/67369.
4
Evaluating and Improving Syndrome Differentiation Thinking Ability in Large Language Models: Method Development Study.
JMIR Med Inform. 2025 Jun 20;13:e75103. doi: 10.2196/75103.
5
Implementing Large Language Models in Health Care: Clinician-Focused Review With Interactive Guideline.
J Med Internet Res. 2025 Jul 11;27:e71916. doi: 10.2196/71916.
6
Aligning Large Language Models for Enhancing Psychiatric Interviews Through Symptom Delineation and Summarization: Pilot Study.
JMIR Form Res. 2024 Oct 24;8:e58418. doi: 10.2196/58418.
7
NUPES: Non-Uniform Post-Training Quantization via Power Exponent Search.
IEEE Trans Pattern Anal Mach Intell. 2025 Nov;47(11):10012-10021. doi: 10.1109/TPAMI.2025.3593987.
8
Developing healthcare language model embedding spaces.
Artif Intell Med. 2024 Dec;158:103009. doi: 10.1016/j.artmed.2024.103009. Epub 2024 Oct 31.
9
Improving unified information extraction in Chinese mental health domain with instruction-tuned LLMs and type-verification component.
Artif Intell Med. 2025 Apr;162:103087. doi: 10.1016/j.artmed.2025.103087. Epub 2025 Feb 19.