Accuracy-based Automatic Quantization

Accuracy-based automatic quantization is intended for users who have specific accuracy requirements for the quantized model. It is implemented through the Python APIs provided by AMCT. The method automatically searches for a quantization configuration and performs post-training quantization (PTQ), producing a quantized model that meets the accuracy requirement.
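
The search described above can be pictured with a small, framework-independent sketch. It is a conceptual illustration only: it does not use the AMCT API and does not claim to reproduce AMCT's actual search strategy. The function search_quant_config, the layer names, and the per-layer accuracy penalties are all hypothetical, and the greedy rollback shown here is just one plausible way to trade quantized layers for accuracy.

```python
from typing import Callable, Dict, List


def search_quant_config(
    layers: List[str],
    evaluate: Callable[[Dict[str, bool]], float],
    baseline_accuracy: float,
    max_accuracy_drop: float,
) -> Dict[str, bool]:
    """Greedy search (hypothetical, not AMCT's algorithm).

    Start with every layer quantized, then restore one layer at a time to
    full precision until the accuracy target is met.
    """
    config = {name: True for name in layers}  # True means "quantize this layer"
    target = baseline_accuracy - max_accuracy_drop

    while evaluate(config) < target and any(config.values()):
        # Find the quantized layer whose rollback recovers the most accuracy.
        best_layer, best_acc = None, float("-inf")
        for name in layers:
            if not config[name]:
                continue
            trial = dict(config, **{name: False})
            acc = evaluate(trial)
            if acc > best_acc:
                best_layer, best_acc = name, acc
        config[best_layer] = False

    return config


# Toy stand-in for a real evaluation run: each quantized layer costs a fixed,
# made-up amount of accuracy. A real evaluator would run the PTQ model on a
# validation set instead.
PENALTY = {"conv1": 0.004, "conv2": 0.030, "fc": 0.001}
BASELINE = 0.760


def toy_evaluate(config: Dict[str, bool]) -> float:
    return BASELINE - sum(PENALTY[n] for n, quantized in config.items() if quantized)


config = search_quant_config(list(PENALTY), toy_evaluate, BASELINE, max_accuracy_drop=0.01)
print(config)
```

With these made-up penalties, the search leaves conv2 in full precision and quantizes the other two layers, because rolling back conv2 alone brings the accuracy drop within the allowed 0.01.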

Currently, only the PyTorch, TensorFlow, and Caffe frameworks support accuracy-based automatic quantization. For details, see AMCT Instructions.
