AMCT Workflow
This section describes the workflow of the AMCT tool. The workflow varies depending on the framework environment.
PyTorch/ONNX/TensorFlow/Caffe scenario
Figure 1 shows the toolkit workflow.
| Operation | Description |
|---|---|
| Package preparation | Obtain the tool package. For details, see Software Package Preparation. |
| Preparing for Installation | Before installing AMCT, create an installation user, check the OS environment, install dependencies, and upload the AMCT package. For details, see Preparing for Installation. The configuration varies depending on the framework environment; for details, see the corresponding framework. |
| Installation | Install the tool. The installation command varies depending on the framework; for details, see the installation procedure of the corresponding framework. |
| Post-installation actions | After the installation is complete, perform the post-installation actions for your framework. If no such actions are listed for your framework, skip this step. |
| (Optional) Script creation using AMCT APIs | If the sample provided by AMCT is used for model compression, the APIs in this document can be called directly. The sample code in this document is based on Ascend's sample model; you can adapt it to your own model with a few changes. |
| Compression | Perform the compression. AMCT provides two quantization methods, CLI-based and Python API-based; for their differences, see Table 1. Run the provided quantization script or CLI to quantize your original network model with the prepared datasets. AMCT is built on deep learning frameworks, so quantization calls the framework in use to perform inference or training. |
| (Follow-up) Inference on the quantized model | Use ATC to convert the quantized deployable model into an offline model adapted to the Ascend AI Processor, then perform subsequent inference. |
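The follow-up conversion step can be sketched as an ATC command line. The model and output names below are hypothetical, and the `--soc_version` value must match your target chip; the command is assembled and echoed rather than executed, since running ATC requires the Ascend toolkit to be installed.

```shell
# Sketch of converting a quantized deployable model into an offline .om model.
# File names are hypothetical; --framework=3 selects TensorFlow in the ATC CLI.
MODEL="resnet50_quantized.pb"   # quantized deployable model produced by AMCT
atc_cmd="atc --model=${MODEL} --framework=3 --output=resnet50_quantized --soc_version=Ascend310"
echo "${atc_cmd}"               # run this command on a host with the Ascend toolkit
```

For an ONNX model you would pass `--framework=5` instead, and for Caffe `--framework=0`; the resulting `.om` file is what the subsequent inference step consumes.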
TensorFlow and Ascend
Figure 2 shows the toolkit workflow.
| Operation | Description |
|---|---|
| Set up the online inference environment with NPUs | Perform the following two steps (install TensorFlow, then AMCT) to set up an online inference environment powered by NPUs. |
| Install the CPU version of TensorFlow | The online inference environment supports quantization only on the NPU, not on the GPU, so only the CPU version of TensorFlow is required. For the installation procedure, see AMCT (TensorFlow, Ascend). |
| Install AMCT | Install the AMCT for the TensorFlow framework by referring to Installation. Before the installation, obtain the software package, create the AMCT installation user, check the environment, install dependencies, and upload the software package. |
| (Optional) Script creation using AMCT APIs | If the sample provided by AMCT is used for model compression, the APIs in this document can be called directly. The sample code in this document is based on Ascend's sample model; you can adapt it to your own model with a few changes. |
| Quantization | Run the provided quantization script to quantize your source network model with the prepared datasets. AMCT is built on the deep learning framework; during quantization, the framework is called to perform the necessary inference. |
| (Follow-up) Run inference on the quantized model | The quantized .pb model can be used for online inference in the NPU environment. For details, see the TensorFlow 1.15 Model Porting Guide or the TensorFlow 2.6.5 Model Porting Guide, depending on your TensorFlow version. |
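The quantization step is performed by AMCT itself, but the core idea behind post-training INT8 quantization can be sketched in plain Python: derive a scale from the observed value range, map floats to signed 8-bit integers, and dequantize at inference time. This is an illustrative sketch of the general technique only, not AMCT's actual implementation; the weight values are made up.

```python
# Minimal sketch of symmetric INT8 post-training quantization.
# AMCT's real algorithm (calibration, per-channel scales, etc.) is more involved.

def quantize(values, num_bits=8):
    """Map float values to signed integers using a symmetric scale."""
    qmax = 2 ** (num_bits - 1) - 1                    # 127 for INT8
    scale = max(abs(v) for v in values) / qmax        # one scale for the whole tensor
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [qi * scale for qi in q]

weights = [0.5, -1.27, 0.003, 1.27]                   # hypothetical weight data
q, scale = quantize(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_err <= scale)                            # error is bounded by one step
```

The accuracy loss of the quantized model comes entirely from this rounding step, which is why AMCT needs representative calibration datasets: the scale must cover the value range actually seen at inference time.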

