SQ8

INT8Flat and SQ8 differ in where quantization takes place. With INT8Flat, features are quantized externally, so the index takes INT8 features as input. With SQ8, quantization is performed inside the index, so the index takes Float32 features as input.
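The following sketch illustrates what "quantized externally" could look like on the caller's side before features are handed to an INT8 index. It is a generic min/max scalar quantizer written for illustration only: the function name quantize_int8 and the value range are assumptions, and the source does not state which quantization scheme SQ8 applies internally.

```python
def quantize_int8(vector, lo, hi):
    """Map floats in [lo, hi] to integers in [-128, 127] (illustrative scheme)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    codes = []
    for x in vector:
        clipped = min(max(x, lo), hi)              # clamp to the expected range
        level = round((clipped - lo) * 255 / (hi - lo))  # 0..255
        codes.append(level - 128)                  # shift into signed INT8 range
    return codes


# Hypothetical Float32 features, assumed to lie in [0.0, 1.0].
features = [0.0, 0.2, 0.9, 1.0]
codes = quantize_int8(features, 0.0, 1.0)
```

With SQ8 this step is not needed on the caller's side, because the Float32 features are passed to the index directly.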

Usage

python3 sq8_generate_model.py -d <dim> --cores <core_num> -p <process_id> -pool <pool_size> -t <npu_type>

Parameters

<dim>: feature vector dimension. The default value is 128.

<core_num>: number of AI Cores of the Ascend AI Processor. The default value is 2. The value depends on <npu_type>: set <core_num> to 2 when <npu_type> is 310, and to 8 when <npu_type> is 310P.

<process_id>: ID of the process used for multi-process scheduling when operators are generated in batches. The default value is 0. You do not need to set this parameter.

<pool_size>: size of the process pool used for multi-process scheduling when operators are generated in batches. The default value is 10.

<npu_type>: hardware type. Atlas 200/300/500 inference products and Atlas inference products are currently supported. The value can be 310 (default) or 310P.

--help | -h: help information.

Description

Run the command to generate a group of SQ8 operator model files for distance calculation. Replace the parameter placeholders in the command with values that match your environment.

Restrictions

  • dim ∈ {64, 128, 256, 384, 512, 768}
  • 0 ≤ pool_size ≤ 32
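
The restrictions and the <npu_type>-to-<core_num> pairing described above can be checked before the script is launched. The helper below is a minimal sketch, not part of the tool: the name build_sq8_command is hypothetical, and it only assembles the argument list from the flags shown in the Usage line. The -p option is left out because the documentation says it does not need to be set.

```python
ALLOWED_DIMS = {64, 128, 256, 384, 512, 768}
CORES_BY_NPU = {"310": 2, "310P": 8}  # core count documented for each npu_type


def build_sq8_command(dim=128, npu_type="310", pool_size=10):
    """Validate the arguments and return the command as an argument list."""
    if dim not in ALLOWED_DIMS:
        raise ValueError(f"dim must be one of {sorted(ALLOWED_DIMS)}, got {dim}")
    if not 0 <= pool_size <= 32:
        raise ValueError(f"pool_size must be in [0, 32], got {pool_size}")
    if npu_type not in CORES_BY_NPU:
        raise ValueError(f"npu_type must be 310 or 310P, got {npu_type}")
    return [
        "python3", "sq8_generate_model.py",
        "-d", str(dim),
        "--cores", str(CORES_BY_NPU[npu_type]),
        "-pool", str(pool_size),
        "-t", npu_type,
    ]


command = build_sq8_command(dim=256, npu_type="310P", pool_size=16)
print(" ".join(command))
```

The returned list can be passed to subprocess.run from the directory that contains sq8_generate_model.py.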