Searching for Quantization Layers with Accuracy Drop

Currently, only Caffe models are supported. Use Model Accuracy Analyzer to find the quantization layers that show an accuracy drop. The process consists of the following two steps:

  1. Locate the accuracy issue during the quantization phase.

    Compare the accuracy of a non-quantized original model (GPU/CPU) with that of a quantized original model (GPU/CPU).

  2. Locate the accuracy issue during the model conversion phase, that is, the accuracy issue of the quantized offline model running on the NPU.

    Compare the accuracy of a quantized original model (GPU/CPU) with that of a quantized offline model (NPU, fusion pattern disabled).

For details, see Comparison Between GPU/CPU and NPU (Caffe Offline Inference) in the Accuracy Debugging Tool Guide.
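
The two comparisons above can be pictured as a layer-by-layer similarity check between two sets of dumped layer outputs. The following Python sketch is illustrative only and is not the Model Accuracy Analyzer interface: the function names, the dictionary-based dump format, the use of cosine similarity as the metric, the 0.99 threshold, and the sample values are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flattened layer outputs."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Two all-zero outputs count as identical; otherwise as unrelated.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def layers_with_accuracy_drop(reference, candidate, threshold=0.99):
    """Return (layer_name, similarity) for layers whose similarity is below threshold."""
    flagged = []
    for name, ref_out in reference.items():
        if name not in candidate:
            continue  # Layer has no counterpart in the other dump.
        sim = cosine_similarity(ref_out, candidate[name])
        if sim < threshold:
            flagged.append((name, sim))
    return flagged


# Hypothetical per-layer outputs (layer name -> flattened output values).
non_quantized_cpu = {"conv1": [1.0, 2.0, 3.0], "conv2": [0.5, -1.0, 2.0], "fc": [1.0, 0.0, 0.0]}
quantized_cpu = {"conv1": [1.0, 2.0, 3.0], "conv2": [0.5, -1.0, 2.0], "fc": [0.0, 1.0, 0.0]}
quantized_npu = {"conv1": [1.0, 2.0, 3.0], "conv2": [-0.5, 1.0, -2.0], "fc": [0.0, 1.0, 0.0]}

# Step 1: non-quantized original model vs. quantized original model (GPU/CPU).
step1 = layers_with_accuracy_drop(non_quantized_cpu, quantized_cpu)
# Step 2: quantized original model (GPU/CPU) vs. quantized offline model (NPU).
step2 = layers_with_accuracy_drop(quantized_cpu, quantized_npu)

print([name for name, _ in step1])  # ['fc']
print([name for name, _ in step2])  # ['conv2']
```

In this sketch, a layer flagged in step 1 points to the quantization phase, and a layer flagged only in step 2 points to the model conversion phase. The actual metrics and thresholds used by the tool are described in the guide referenced above.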