create_quant_retrain_config
Description
Finds all quantizable layers in a graph, creates a quantization configuration file, and writes the quantization configuration of the quantizable layers to the configuration file.
Prototype
```python
create_quant_retrain_config(config_file, model, input_data, config_defination=None)
```
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| config_file | Input | Path (including the file name) of the QAT configuration file. If a file already exists at this path, it is overwritten by this API call. Type: string. |
| model | Input | Original model for QAT, with weights loaded. Type: torch.nn.Module. |
| input_data | Input | Input data of the model. A single torch.tensor is treated as the equivalent tuple(torch.tensor). Type: tuple. |
| config_defination | Input | Simplified configuration file. The simplified configuration file quant.cfg is generated based on the retrain_config_pytorch.proto file. The *.proto file is stored in /amct_pytorch/proto/ under the AMCT installation directory. For details about the parameters in the *.proto file and the generated simplified quantization configuration file quant.cfg, see Simplified QAT Configuration File. Default: None. Type: string. |
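The input_data description can be read as a simple normalization step: a lone tensor is wrapped into a one-element tuple before use. The sketch below illustrates that reading only. `as_input_tuple` is a hypothetical helper, not an AMCT function, and AMCT's internal handling may differ. A plain list stands in for a tensor so that the snippet runs without PyTorch.

```python
def as_input_tuple(input_data):
    """Wrap a single input in a one-element tuple; pass tuples through unchanged."""
    if isinstance(input_data, tuple):
        return input_data
    # (x,) keeps x whole, whereas tuple(x) would iterate over the elements of x.
    return (input_data,)


single = [0.0, 1.0]  # stand-in for a torch.Tensor in this sketch
print(len(as_input_tuple(single)))            # 1
print(len(as_input_tuple((single, single))))  # 2
```

Passing a tuple explicitly, as the example on this page does, avoids relying on any implicit conversion.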
Returns
None
Example
```python
import torch
import amct_pytorch as amct

# Build the network to be quantized and load its weights.
# build_model, state_dict_path, and input_shape are placeholders for user code.
model = build_model()
model.load_state_dict(torch.load(state_dict_path))
input_data = tuple([torch.randn(input_shape)])

# Create a quantization configuration file.
amct.create_quant_retrain_config(config_file="./configs/config.json",
                                 model=model,
                                 input_data=input_data)
```
Output file: a quantization configuration file in JSON format, as shown in the following examples. The configuration file output by this API is overwritten when QAT is performed again. For details about the parameters, see Quantization Configuration File.
- INT8 quantization configuration file
  ```json
  {
      "version":1,
      "batch_num":1,
      "conv1":{
          "retrain_enable":true,
          "retrain_data_config":{
              "algo":"ulq_quantize",
              "dst_type":"INT8"
          },
          "retrain_weight_config":{
              "algo":"arq_retrain",
              "channel_wise":true,
              "dst_type":"INT8"
          }
      },
      "layer1.0.conv1":{
          "retrain_enable":true,
          "retrain_data_config":{
              "algo":"ulq_quantize",
              "dst_type":"INT8"
          },
          "retrain_weight_config":{
              "algo":"arq_retrain",
              "channel_wise":true,
              "dst_type":"INT8"
          }
      },
      "fc":{
          "retrain_enable":true,
          "retrain_data_config":{
              "algo":"ulq_quantize",
              "dst_type":"INT8"
          },
          "retrain_weight_config":{
              "algo":"arq_retrain",
              "channel_wise":false,
              "dst_type":"INT8"
          }
      }
  }
  ```
- INT4 quantization configuration file
  ```json
  {
      "version":1,
      "batch_num":2,
      "conv1":{
          "retrain_enable":true,
          "retrain_data_config":{
              "algo":"ulq_quantize",
              "dst_type":"INT8"
          },
          "retrain_weight_config":{
              "algo":"arq_retrain",
              "channel_wise":true,
              "dst_type":"INT8"
          }
      },
      "layer1.0.conv1":{
          "retrain_enable":true,
          "retrain_data_config":{
              "algo":"ulq_quantize",
              "dst_type":"INT4"
          },
          "retrain_weight_config":{
              "algo":"arq_retrain",
              "channel_wise":true,
              "dst_type":"INT4"
          }
      },
      "fc":{
          "retrain_enable":true,
          "retrain_data_config":{
              "algo":"ulq_quantize",
              "dst_type":"INT4"
          },
          "retrain_weight_config":{
              "algo":"arq_retrain",
              "channel_wise":true,
              "dst_type":"INT4"
          }
      }
  }
  ```
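To check which layers a generated file quantizes, and at what precision, the JSON can be read back with the Python standard library. The following is a minimal sketch and not part of AMCT: the helper name `summarize_retrain_config` is hypothetical, and it assumes that every top-level key other than `version` and `batch_num` is a layer entry shaped like the examples above.

```python
import json

# Top-level keys treated as global settings rather than layer names
# (an assumption based on the example files on this page).
GLOBAL_KEYS = {"version", "batch_num"}


def summarize_retrain_config(config_text):
    """Return {layer_name: (data_dst_type, weight_dst_type, channel_wise)}."""
    config = json.loads(config_text)
    summary = {}
    for name, layer in config.items():
        if name in GLOBAL_KEYS or not isinstance(layer, dict):
            continue
        if not layer.get("retrain_enable", False):
            continue  # skip layers that are excluded from QAT
        data_cfg = layer.get("retrain_data_config", {})
        weight_cfg = layer.get("retrain_weight_config", {})
        summary[name] = (
            data_cfg.get("dst_type"),
            weight_cfg.get("dst_type"),
            weight_cfg.get("channel_wise"),
        )
    return summary


# Trimmed sample in the same shape as the INT4 example.
SAMPLE = """
{
    "version": 1,
    "batch_num": 2,
    "conv1": {
        "retrain_enable": true,
        "retrain_data_config": {"algo": "ulq_quantize", "dst_type": "INT8"},
        "retrain_weight_config": {"algo": "arq_retrain", "channel_wise": true, "dst_type": "INT8"}
    },
    "fc": {
        "retrain_enable": true,
        "retrain_data_config": {"algo": "ulq_quantize", "dst_type": "INT4"},
        "retrain_weight_config": {"algo": "arq_retrain", "channel_wise": true, "dst_type": "INT4"}
    }
}
"""

for layer_name, settings in summarize_retrain_config(SAMPLE).items():
    print(layer_name, settings)
```

For a file on disk, the same helper can be given the file's text, for example the contents of ./configs/config.json written by the example call.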