AMCT Workflow
This section describes the workflow of the AMCT tool. The workflow varies depending on the framework environment.
PyTorch/ONNX/TensorFlow/Caffe scenario
Figure 1 shows the toolkit workflow.
| Operation | Description |
|---|---|
| Package preparation | Obtain the tool package. For details, see Software Package Preparation. |
| Preparing for Installation | Before installing AMCT, create an installation user, check the OS environment, install dependencies, and upload the AMCT package. For details, see Preparing for Installation. The configuration varies depending on the framework environment; for details, see the corresponding framework. |
| Installation | Install the tool. The installation command varies depending on the framework; for details, see the installation procedure of the corresponding framework. |
| Post-installation actions | After the installation is complete, perform the post-installation actions for your framework. If no such actions are listed for your framework, skip this step. |
| (Optional) Script creation using AMCT APIs | If the sample provided by AMCT is used for model compression, the APIs in this document can be called directly. The sample code in this document is based on Ascend's sample model; you can adapt it to your own model with a few changes. |
| Compression | Perform the compression. AMCT provides two quantization methods, CLI-based and Python API-based; for their differences, see Table 1. Run the provided quantization script or CLI to quantize your original network model with the prepared datasets. AMCT is built on deep learning frameworks, so quantization calls the framework in use to perform inference or training. |
| (Follow-up) Inference on the quantized model | Use ATC to convert the quantized deployable model into an offline model adapted to the Ascend AI Processor, then perform subsequent inference. |
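The follow-up conversion step can be sketched as an ATC command line. The model and output names below are hypothetical, and the `--soc_version` value must match your target chip; the command is assembled and echoed rather than executed, since running ATC requires the Ascend toolkit to be installed.

```shell
# Sketch of converting a quantized deployable model into an offline .om model.
# File names are hypothetical; --framework=3 selects TensorFlow in the ATC CLI.
MODEL="resnet50_quantized.pb"   # quantized deployable model produced by AMCT
atc_cmd="atc --model=${MODEL} --framework=3 --output=resnet50_quantized --soc_version=Ascend310"
echo "${atc_cmd}"               # run this command on a host with the Ascend toolkit
```

For an ONNX model you would pass `--framework=5` instead, and for Caffe `--framework=0`; the resulting `.om` file is what the subsequent inference step consumes.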
TensorFlow and Ascend
Figure 2 shows the toolkit workflow.
| Operation | Description |
|---|---|
| Set up the online inference environment with NPUs | Perform the following two steps (install TensorFlow, then AMCT) to set up an online inference environment powered by NPUs. |
| Install the CPU version of TensorFlow | The online inference environment supports quantization only on the NPU, not on the GPU, so only the CPU version of TensorFlow is required. For the installation procedure, see AMCT (TensorFlow, Ascend). |
| Install AMCT | Install the AMCT for the TensorFlow framework by referring to Installation. Before the installation, obtain the software package, create the AMCT installation user, check the environment, install dependencies, and upload the software package. |
| (Optional) Script creation using AMCT APIs | If the sample provided by AMCT is used for model compression, the APIs in this document can be called directly. The sample code in this document is based on Ascend's sample model; you can adapt it to your own model with a few changes. |
| Quantization | Run the provided quantization script to quantize your source network model with the prepared datasets. AMCT is built on the deep learning framework; during quantization, the framework is called to perform the necessary inference. |
| (Follow-up) Run inference on the quantized model | The quantized .pb model can be used for online inference in the NPU environment. For details, see the TensorFlow 1.15 Model Porting Guide or the TensorFlow 2.6.5 Model Porting Guide, depending on your TensorFlow version. |
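The quantization step is performed by AMCT itself, but the core idea behind post-training INT8 quantization can be sketched in plain Python: derive a scale from the observed value range, map floats to signed 8-bit integers, and dequantize at inference time. This is an illustrative sketch of the general technique only, not AMCT's actual implementation; the weight values are made up.

```python
# Minimal sketch of symmetric INT8 post-training quantization.
# AMCT's real algorithm (calibration, per-channel scales, etc.) is more involved.

def quantize(values, num_bits=8):
    """Map float values to signed integers using a symmetric scale."""
    qmax = 2 ** (num_bits - 1) - 1                    # 127 for INT8
    scale = max(abs(v) for v in values) / qmax        # one scale for the whole tensor
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [qi * scale for qi in q]

weights = [0.5, -1.27, 0.003, 1.27]                   # hypothetical weight data
q, scale = quantize(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_err <= scale)                            # error is bounded by one step
```

The accuracy loss of the quantized model comes entirely from this rounding step, which is why AMCT needs representative calibration datasets: the scale must cover the value range actually seen at inference time.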

