Accuracy-based Automatic Quantization
Use accuracy-based automatic quantization when the quantized model must meet a specific accuracy requirement. It is implemented through the Python APIs provided by AMCT. The method automatically searches for a quantization configuration and applies post-training quantization (PTQ), producing a quantized model whose accuracy meets that requirement.
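The idea of searching for a configuration that satisfies an accuracy requirement can be illustrated with a minimal sketch. It assumes one common strategy: quantize every layer, then roll the most sensitive layers back to full precision until the accuracy drop fits the budget. It does not use the AMCT API and is not necessarily the algorithm AMCT implements; the names `accuracy_based_search`, `toy_evaluate`, `PENALTY`, and the layer names and numbers are all hypothetical.

```python
from typing import Callable, Dict, List, Set


def accuracy_based_search(
    layers: List[str],
    evaluate: Callable[[Set[str]], float],
    sensitivity: Dict[str, float],
    max_accuracy_drop: float,
) -> Set[str]:
    """Return the layers to quantize so that the accuracy drop stays in budget."""
    baseline = evaluate(set())   # accuracy with no layer quantized
    quantized = set(layers)      # start by quantizing every layer
    # Candidates for rollback, most sensitive layer first.
    rollback_order = sorted(layers, key=lambda name: sensitivity[name], reverse=True)
    for name in rollback_order:
        if baseline - evaluate(quantized) <= max_accuracy_drop:
            break                # accuracy requirement met, stop searching
        quantized.discard(name)  # keep this layer in full precision
    return quantized


# Hypothetical per-layer accuracy penalty incurred when a layer is quantized.
PENALTY = {"conv1": 0.02, "conv2": 0.001, "fc": 0.004}


def toy_evaluate(quantized: Set[str]) -> float:
    """Stand-in for a real evaluation run on a validation set."""
    return 0.90 - sum(PENALTY[name] for name in quantized)


selected = accuracy_based_search(
    layers=list(PENALTY),
    evaluate=toy_evaluate,
    sensitivity=PENALTY,         # assumption: sensitivity equals the penalty
    max_accuracy_drop=0.01,
)
print(sorted(selected))
```

In this toy setup, quantizing all three layers exceeds the 0.01 budget, so the search rolls back `conv1` and keeps `conv2` and `fc` quantized. In a real workflow the evaluator would run the calibrated model on a validation set, and sensitivity would be measured rather than assumed.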
Accuracy-based automatic quantization is currently supported only for the PyTorch, TensorFlow, and Caffe frameworks. For details, see AMCT Instructions.