LAC

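LAC (Learnable Activation Clipping) is a differentiable activation-range optimization technique for neural network quantization, especially of LLMs. In addition to learnable clipping ranges, it applies channel-scaling-style structural transformations to improve accuracy under low-bit weight/activation (W/A) quantization. For a detailed description of the algorithm, see: OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models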
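
The idea of a clipping range that is tuned rather than fixed can be illustrated with a minimal pure-Python sketch. It is not the tool's implementation. It assumes a uniform asymmetric quantizer whose upper and lower bounds are the tensor's max and min scaled by two factors, `gamma` and `beta`, in the style of the learnable clipping described in the OmniQuant paper. In real training these factors would be optimized by gradient descent; here a coarse grid search stands in for that step, and the channel-scaling transformation is omitted. All function names and the sample data are hypothetical.

```python
def fake_quantize(xs, n_bits=4, gamma=1.0, beta=1.0):
    """Quantize then dequantize xs using a clipped uniform grid.

    gamma and beta (assumed to lie in (0, 1]) shrink the upper and
    lower clipping bounds relative to max(xs) and min(xs).
    """
    hi = gamma * max(xs)
    lo = beta * min(xs)
    levels = 2 ** n_bits - 1
    step = (hi - lo) / levels
    if step <= 0:
        raise ValueError("clipped range must have positive width")
    out = []
    for x in xs:
        q = round((x - lo) / step)      # nearest grid index
        q = min(max(q, 0), levels)      # clip to the representable range
        out.append(lo + q * step)       # map back to a real value
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_clip(xs, n_bits=4, grid=None):
    """Grid search over (gamma, beta); a stand-in for gradient training."""
    if grid is None:
        grid = [i / 20 for i in range(10, 21)]  # 0.50, 0.55, ..., 1.00
    best = None
    for gamma in grid:
        for beta in grid:
            err = mse(xs, fake_quantize(xs, n_bits, gamma, beta))
            if best is None or err < best[2]:
                best = (gamma, beta, err)
    return best


# Hypothetical activations: mostly small values plus one large outlier.
acts = [-0.9, -0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 0.7, 6.0]
baseline = mse(acts, fake_quantize(acts, n_bits=4))
gamma, beta, err = search_clip(acts, n_bits=4)
print(f"no clipping: mse={baseline:.5f}")
print(f"searched:    gamma={gamma}, beta={beta}, mse={err:.5f}")
```

Because the grid contains `gamma = beta = 1.0`, the searched error can never exceed the unclipped baseline. Whether a tighter range actually helps depends on how the loss trades outlier clipping error against finer resolution for the remaining values.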