Offline Model File Preparation

Non-Quantized Offline Model File

The following describes how to obtain an offline model using the ATC model conversion tool. For more operations, see the ATC Instructions.

  1. Log in to an Ascend AI environment where the Ascend-CANN-Toolkit has been installed.
  2. Obtain the original model files and save them in any directory.

    Example: resnet50.prototxt and resnet50.caffemodel

  3. Enable operator fusion and execute ATC-based model conversion.
    atc --model=$HOME/module/resnet50.prototxt --weight=$HOME/module/resnet50.caffemodel --framework=0 --output=$HOME/module/out/caffe_resnet50_on --soc_version=<soc_version> 

    During model conversion, operator fusion is enabled by default and does not need to be configured.

    You should see information similar to the following if the conversion is successful.
    ATC run success
    

    After the command is successfully executed, an offline model (for example, caffe_resnet50_on.om) is generated in the $HOME/module/out/ directory.

  4. Disable operator fusion and execute ATC-based model conversion.
    atc --model=$HOME/module/resnet50.prototxt --weight=$HOME/module/resnet50.caffemodel --framework=0 --output=$HOME/module/out/caffe_resnet50_off --soc_version=<soc_version> --fusion_switch_file=$HOME/module/fusion_switch.cfg

    To disable operator fusion, use the --fusion_switch_file option to specify an operator fusion rule configuration file (for example, fusion_switch.cfg) in which fusion is switched off. The following configuration in that file disables all graph fusion and UB fusion rules:

    {
        "Switch":{
            "GraphFusion":{
                "ALL":"off"
            },
            "UBFusion":{
                "ALL":"off"
            }
        }
    }
    You should see information similar to the following if the conversion is successful.
    ATC run success
    

    After the command is successfully executed, an offline model (for example, caffe_resnet50_off.om) is generated in the $HOME/module/out/ directory.
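The two conversions above differ only in the --fusion_switch_file option. As a minimal sketch, the following Python script writes the fusion configuration shown above and assembles both ATC command lines. The paths, output names, and the <soc_version> placeholder are taken from the examples in this section; adjust them for your environment.

```python
import json
import os
import shlex

def write_fusion_switch_cfg(path):
    """Write the configuration that disables all graph fusion and UB fusion rules."""
    cfg = {
        "Switch": {
            "GraphFusion": {"ALL": "off"},
            "UBFusion": {"ALL": "off"},
        }
    }
    with open(path, "w") as f:
        json.dump(cfg, f, indent=4)

def build_atc_command(model_dir, out_name, soc_version, fusion_switch_file=None):
    """Assemble the ATC command line from the steps above; paths are examples."""
    args = [
        "atc",
        f"--model={model_dir}/resnet50.prototxt",
        f"--weight={model_dir}/resnet50.caffemodel",
        "--framework=0",                      # 0 = Caffe
        f"--output={model_dir}/out/{out_name}",
        f"--soc_version={soc_version}",
    ]
    if fusion_switch_file:                    # only passed for the fusion-off variant
        args.append(f"--fusion_switch_file={fusion_switch_file}")
    return args

if __name__ == "__main__":
    module = os.path.join(os.path.expanduser("~"), "module")
    os.makedirs(module, exist_ok=True)
    write_fusion_switch_cfg(os.path.join(module, "fusion_switch.cfg"))
    # Print both commands; replace <soc_version> with your target chip version.
    print(shlex.join(build_atc_command(module, "caffe_resnet50_on", "<soc_version>")))
    print(shlex.join(build_atc_command(module, "caffe_resnet50_off", "<soc_version>",
                                       os.path.join(module, "fusion_switch.cfg"))))
```

Running the script prints the two commands for inspection rather than executing them, so you can verify the options before invoking ATC.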

Quantized Offline Model File

The following uses a Caffe model as an example to describe how to obtain the quantization information file by using the AMCT tool. For more operations, see the AMCT Instructions.

  1. Install the tool. For details, see Tool Installation in the AMCT Instructions.
  2. Obtain the original model files and save them in any directory.

    Example: resnet50.prototxt and resnet50.caffemodel

  3. Prepare a binary dataset that matches the model.
    1. Switch to the amct_caffe/cmd directory and run the following commands to download the calibration dataset:
      cd data 
      mkdir image && cd image
      wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/amct_acl/classification/calibration.rar
      unrar e calibration.rar
    2. In the amct_caffe/cmd directory, run the following command to convert the .jpg dataset in the calibration folder to a .bin dataset:
      python3 ./src/process_data.py

      After the execution is complete, a new calibration folder containing the converted calibration.bin dataset is generated in the data folder.
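The contents of process_data.py are not shown here, but the layout and dtype conversion it must perform follow from the input shape "data:1,3,224,224" and data type "float32" used in the quantization command below. The following is a minimal sketch of that conversion under those assumptions; the real script may additionally resize, crop, and normalize the images.

```python
import numpy as np

# Shape assumed by the quantization command: "data:1,3,224,224".
N, C, H, W = 1, 3, 224, 224

def image_to_bin(image_hwc, out_path):
    """Convert one decoded HWC uint8 image to a float32 NCHW .bin file.

    This sketch covers only the layout change and dtype conversion that
    calibration data in .bin form requires; mean/std normalization and
    resizing, if any, are omitted.
    """
    x = image_hwc.astype(np.float32)          # uint8 -> float32
    x = np.transpose(x, (2, 0, 1))            # HWC -> CHW
    x = np.expand_dims(x, axis=0)             # add batch dimension -> NCHW
    assert x.shape == (N, C, H, W)
    x.tofile(out_path)                        # raw little-endian float32

if __name__ == "__main__":
    # Stand-in for a decoded 224x224 RGB JPEG from the calibration set.
    dummy = np.random.randint(0, 256, size=(H, W, C), dtype=np.uint8)
    image_to_bin(dummy, "calibration.bin")
```

Each resulting file holds 1 x 3 x 224 x 224 float32 values (602,112 bytes), which is what a tool consuming raw .bin calibration data can read back with a simple reshape.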

  4. Run the following command to perform quantization on the network model:
    amct_caffe calibration --model=./model/resnet50.prototxt --weights=./model/resnet50.caffemodel --save_path=./results --input_shape="data:1,3,224,224" --data_dir="./data/calibration" --data_types="float32"
  5. Check the quantization result. If the following information is displayed with no error log, the quantization is successful.
    INFO - [AMCT]:[Utils]: The weights_file is saved in $HOME/xxx/results/resnet50_fake_quant_weights.caffemodel
    INFO - [AMCT]:[Utils]: The model_file is saved in $HOME/xxx/results/resnet50_fake_quant_model.prototxt
    
    The resultant files and directories are described as follows:
    • resnet50_quant.json: quantization information file. This file gives the node mapping between the quantized model and the original model and is used for accuracy comparison between the quantized model and the original model.
    • resnet50_deploy_model.prototxt: quantized model file to be deployed on the Ascend AI Processor.
    • resnet50_deploy_weights.caffemodel: weight file of the quantized model to be deployed on the Ascend AI Processor.
    • resnet50_fake_quant_model.prototxt: quantized model file for accuracy simulation in the Caffe environment.
    • resnet50_fake_quant_weights.caffemodel: weight file of the quantized model for accuracy simulation in the Caffe environment.
  6. Refer to Non-Quantized Offline Model File and use ATC to convert the quantized model files resnet50_deploy_model.prototxt and resnet50_deploy_weights.caffemodel, obtaining quantized offline model files with operator fusion enabled and with it disabled.
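Applied to the quantized deploy files, the ATC commands from the first section would look like the following sketch. The ./results path matches the --save_path used above; the output prefixes, the fusion_switch.cfg location, and the <soc_version> placeholder are illustrative assumptions.

```python
import shlex

def quantized_atc_command(results_dir, out_prefix, soc_version, fusion_switch_file=None):
    """ATC command for the quantized deploy model; mirrors the non-quantized steps."""
    args = [
        "atc",
        f"--model={results_dir}/resnet50_deploy_model.prototxt",
        f"--weight={results_dir}/resnet50_deploy_weights.caffemodel",
        "--framework=0",                      # 0 = Caffe
        f"--output={results_dir}/out/{out_prefix}",
        f"--soc_version={soc_version}",
    ]
    if fusion_switch_file:                    # pass only for the fusion-off variant
        args.append(f"--fusion_switch_file={fusion_switch_file}")
    return shlex.join(args)

# Fusion enabled (the default) and disabled variants:
print(quantized_atc_command("./results", "caffe_resnet50_quant_on", "<soc_version>"))
print(quantized_atc_command("./results", "caffe_resnet50_quant_off", "<soc_version>",
                            "./fusion_switch.cfg"))
```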