Sample List

Table 1 Samples

| Framework | Feature | How to Obtain |
|-----------|---------|---------------|
| PyTorch | Accuracy-based Automatic Quantization | Click auto_calibration to obtain the sample. For details, see the readme file. |
| | Uniform Quantization | Click calibration to obtain the sample. For details, see the readme file. |
| | NUQ | Click calibration_nuq to obtain the sample. For details, see the readme file. |
| | QAT | Click retrain to obtain the sample. For details, see the readme file. |
| | Auto Channel Pruning Search | Click auto_channel_prune to obtain the sample. For details, see the readme file. |
| | Filter-Level Sparsity | Click channel_prune to obtain the sample. For details, see the readme file. |
| | 2:4 Structured Sparsity | Click selective_prune to obtain the sample. For details, see the readme file. |
| | Compression Combination | Click mix_compression to obtain the sample. For details, see the readme file. |
| | Tensor Decomposition | Click tensor_decompose to obtain the sample. For details, see the readme file. |
| | QAT in Single-Operator Mode | Click retrain_qat_op to obtain the sample. For details, see the readme file. |
| | Layer-wise Distillation | Click distillation to obtain the sample. For details, see the readme file. |
| | ADA Adaptive Rounding Quantization | Click ada_round_calibration to obtain the sample. For details, see the readme file. |
| PyTorch | KV Cache Quantization | Click kv_cache_quantization to obtain the sample. For details, see the readme file. |
| ONNX | CLI-based Quantization (PTQ using the CLI; QAT model adaptation to CANN format using the CLI) | Click cmd to obtain the sample. For details, see the readme file. |
| | Accuracy-based Automatic Quantization | Click accuracy_based_auto_calibration to obtain the sample. For details, see the readme file for accuracy-based automatic quantization. |
| | Uniform Quantization | Click calibration to obtain the sample. For details, see the readme file for uniform quantization. |
| | NUQ | Click calibration_nuq to obtain the sample. For details, see the readme file for non-uniform quantization. |
| | QAT Model Adaptation to CANN Format | Click convert_qat2ascend to obtain the sample. For details, see the readme file for converting a QAT model to a CANN model. |
| TensorFlow | CLI-based Quantization (PTQ using the CLI; QAT model adaptation to CANN format using the CLI) | Click cmd to obtain the sample. For details, see the readme file. |
| | Accuracy-based Automatic Quantization | Click auto_calibration to obtain the sample. For details, see the readme file. |
| | Uniform Quantization | Click calibration to obtain the sample. For details, see the readme file. |
| | NUQ | Click calibration_nuq to obtain the sample. For details, see the readme file. |
| | QAT | Click retrain to obtain the sample. For details, see the readme file. |
| | Auto Channel Pruning Search | Click auto_channel_prune to obtain the sample. For details, see the readme file. |
| | Filter-Level Sparsity (Manual Sparsity) | Click channel_prune to obtain the sample. For details, see the readme file. |
| | 2:4 Structured Sparsity | Click selective_prune to obtain the sample. For details, see the readme file. |
| | Compression Combination | Click mix_compression to obtain the sample. For details, see the readme file. |
| | Tensor Decomposition | Click tensor_decompose to obtain the sample. For details, see the readme file. |
| | Model Adaptation Using convert_model API | Click convert_model to obtain the sample. For details, see the readme file. |
| | QAT Model Adaptation to CANN Format | Click convert_qat2ascend to obtain the sample. For details, see the readme file. |
| Caffe | CLI-based Quantization | Click cmd to obtain the sample. For details, see the readme file. |
| | Accuracy-based Automatic Quantization | Click auto_calibration to obtain the sample. For details, see the readme file. |
| | Uniform Quantization | Click calibration to obtain the sample. For details, see the readme file. |
| | NUQ | Automatic non-uniform quantization: Click auto_calibration_nuq to obtain the sample. For details, see the readme file. |
| | | Static non-uniform quantization: Click calibration_nuq to obtain the sample. For details, see the readme file. |
| | QAT | Click retrain to obtain the sample. For details, see the readme file. |
| | Tensor Decomposition | Click tensor_decompose to obtain the sample. For details, see the readme file. |
| | Model Adaptation | Click convert_model to obtain the sample. For details, see the readme file. |
| TensorFlow, Ascend | MobileNetV2 | Post-training quantization of the classification network model. Click amct_tensorflow_ascend to obtain the sample from the mobilenetv2 directory. For details, see the readme file. |
| | YOLOv3 | Post-training quantization of the detection network model. Click amct_tensorflow_ascend to obtain the sample from the yolov3 directory. For details, see the readme file. |