Getting Started
This chapter describes the basic workflow of developing an image classification app by using pyACL APIs (Python), as well as the fundamental concepts.
Image Classification App
An image classification app, as the name implies, is used to classify images into different categories.
To build an image classification app, you need to prepare an image classification model. You can use an open-source, pre-trained model, and optimize and retrain it if necessary. Alternatively, you can build your own model from scratch.
For beginners, this document uses an open-source, pre-trained ONNX ResNet-50 model as an example.
Introduction to the ResNet-50 model:
- Input data: RGB input images with 224 x 224 resolution.
- Output data: image class indexes and confidence values.
- The confidence value indicates the probability that an image belongs to a certain class.
- The mapping between class indexes and classes is determined by the dataset used for model training.
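As background for how confidence values arise: the forward method later in this chapter applies a softmax to the model's raw outputs, turning them into values that are positive and sum to 1. A minimal NumPy illustration (the logits here are made-up example numbers):

```python
import numpy as np

# Softmax turns raw scores (logits) into confidence values.
logits = np.array([2.0, 1.0, 0.1], dtype=np.float32)
confidences = np.exp(logits) / np.sum(np.exp(logits))

# Confidences are positive and sum to 1; the largest one marks the predicted class.
predicted_class = int(np.argmax(confidences))
```

The same conversion, over the model's 1000-element output vector, produces the confidence values printed in the App Execution section.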
Prerequisites
- The Ascend AI Software Stack has been deployed.
- For details about the installation environment, see App Development Environment Setup.
- Install necessary Python software dependencies (Pillow and NumPy).
```shell
pip3 install pillow numpy
```
Fundamental Concepts
- Host
A host refers to the x86 or Arm server connected to the device. The host utilizes the NN compute capability provided by the device to implement services.
- Device
A device is a hardware unit with the Ascend AI Processor installed. It connects to the host over a PCIe interface and provides the NN compute capability.
- Development environment and operating environment
The development environment is where you develop code; the operating environment is where operator, inference, or training programs run. The operating environment must contain an Ascend AI Processor.
You can log in to the corresponding environment and run the arch command to query the OS architecture.
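For example, either of the following commands prints the architecture (output varies by machine; typical values are x86_64 for x86 and aarch64 for Arm):

```shell
# Query the OS architecture of the current environment.
arch
# uname -m reports the same value on most Linux distributions.
uname -m
```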
- Running user
Development Workflow
Python Ascend Computing Language (pyACL) is a Python API library encapsulated using CPython based on AscendCL. You can use Python to manage the running and resources of Ascend AI Processors, facilitating deep learning inference computing, graphics and image preprocessing, and single-operator accelerated computing on the Ascend CANN platform.
After learning these basic steps, you can explore the key functions involved in app development, the pyACL APIs that support them, and how those APIs are called in sequence. In short, this chapter helps you understand the overall code logic.
Creating a Code Directory
Create the first_app code directory in the development environment (for example, under $HOME) with the following structure:
```
first_app
├── data
│   ├── dog1_1024_683.jpg    // Test image 1
│   └── dog2_1024_683.jpg    // Test image 2
└── model                    // Stores the ONNX ResNet-50 model file.
    └── resnet50.onnx
```
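The layout above can be created with a couple of shell commands. This is a minimal sketch; the `BASE` variable is an assumption that defaults to `$HOME`, where this tutorial places the project:

```shell
# Create the first_app directory layout. BASE is a hypothetical variable
# defaulting to $HOME; mkdir -p also creates the first_app parent directory.
BASE="${BASE:-$HOME}"
mkdir -p "$BASE/first_app/data" "$BASE/first_app/model"
```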
You need to prepare the following data and models:
- Prepare test data. In this example, two animal images are required. Obtain the images from the following link and upload the downloaded images to the first_app/data directory:
- Prepare the model. Run the following command to download the ONNX model to the model directory, or download it to the local host and upload it to the operating environment.
```shell
cd $HOME/first_app/model
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/003_Atc_Models/resnet50/resnet50.onnx
```
- Model conversion: For a model of an open-source framework, inference cannot be directly performed on Ascend AI Processor. You need to use the Ascend Tensor Compiler (ATC) tool to convert the network model of the open-source framework into an offline model (*.om file) that adapts to Ascend AI Processor.
Run the following command to convert the original model into an .om model file that the Ascend AI Processor can identify. Note that the running user must have read and write permissions on the files specified in the command.

```shell
atc --model=resnet50.onnx --framework=5 --output=resnet50 --input_shape="actual_input_1:1,3,224,224" --soc_version=<soc_version>
```
The parameters are described as follows. For details about the restrictions, see ATC Instructions.
- --model: path of the ResNet-50 model file.
- --framework: source framework type. 5 indicates ONNX.
- --output: path of the resnet50.om model file. Record this path for future use during app development.
- --input_shape: shape of the input data of the model.
- --soc_version: version of Ascend AI Processor.
If you cannot determine the soc_version of the current device, perform the following steps:
App Development
Create the first_app.py file in the first_app directory and add the following content to the file.
- Introduce necessary modules of pyACL and define pyACL constants.
```python
import os
import acl
import numpy as np
from PIL import Image

ACL_MEM_MALLOC_HUGE_FIRST = 0
ACL_MEMCPY_HOST_TO_DEVICE = 1
ACL_MEMCPY_DEVICE_TO_HOST = 2
```
- Define a model object. The network model object contains the following functions:
- Initialization function
- Inference execution function
- Destructor function
For subsequent use, you only need to call the forward function in the network model and transfer the input data to obtain the output.
```python
class net:
    # def __init__(self, model_path):
    #     Initialization function, which needs to be implemented in subsequent steps.

    # def forward(self, inputs):
    #     Executes the inference task; needs to be implemented in subsequent steps.

    # def __del__(self):
    #     Destructor function, which releases resources in the reverse order of
    #     initializing resources; needs to be implemented in subsequent steps.
    pass
```
- Implement the initialization. The following steps are involved (implement the initialization in the net class):
- Call the acl.init API for initialization. Before using pyACL to develop applications, you need to initialize pyACL. (After all pyACL APIs are called, you need to deinitialize pyACL.) During initialization, configuration parameters (for example, performance-related collection information) can be transferred to the initialization API through the JSON configuration file.
- Invoke the acl.rt.set_device interface to specify a computing device based on the ID.
- Load a model.
- Create the input data set and output data set. The corresponding methods are implemented in 4.
```python
def __init__(self, model_path):
    # Initialization function
    self.device_id = 0

    # Step 1: Perform initialization.
    ret = acl.init()
    # Specify a compute device.
    ret = acl.rt.set_device(self.device_id)

    # Step 2: Load the model. In this example, the ResNet-50 model is used.
    # Load an offline model file. A model ID is returned.
    self.model_id, ret = acl.mdl.load_from_file(model_path)
    # Create blank model description information and obtain its pointer address.
    self.model_desc = acl.mdl.create_desc()
    # Fill the model description in model_desc based on the model ID.
    ret = acl.mdl.get_desc(self.model_desc, self.model_id)

    # Step 3: Create input and output data sets.
    # Create an input data set.
    self.input_dataset, self.input_data = self.prepare_dataset('input')
    # Create an output data set.
    self.output_dataset, self.output_data = self.prepare_dataset('output')
```
- Implement the dataset creation method in the net class.
The input and output data for inference must be stored based on the data types specified by pyACL. The related data types are as follows:
- aclmdlDesc data describes the basic information of your model (such as the input/output count, and the name, data type, format, and shape of each input/output).
After the model is successfully loaded, you can call the API of this data type based on the model ID to obtain the model description.
- aclDataBuffer data describes the buffer address and size of each input/output.
Call the operation API of the aclDataBuffer type to obtain the memory address and memory size so that the input data can be stored in the memory and the output data can be obtained.
- aclmdlDataset data describes the input and output data of a model.
You can call the API of this data type to add multiple pieces of data of the aclDataBuffer type when the model has multiple inputs and outputs.
Figure 3 Relationship between aclmdlDataset and aclDataBuffer
```python
def prepare_dataset(self, io_type):
    # Prepare a dataset.
    if io_type == "input":
        # Obtain the number of model inputs.
        io_num = acl.mdl.get_num_inputs(self.model_desc)
        acl_mdl_get_size_by_index = acl.mdl.get_input_size_by_index
    else:
        # Obtain the number of model outputs.
        io_num = acl.mdl.get_num_outputs(self.model_desc)
        acl_mdl_get_size_by_index = acl.mdl.get_output_size_by_index
    # Create data of the aclmdlDataset type to describe the data for model inference.
    dataset = acl.mdl.create_dataset()
    datas = []
    for i in range(io_num):
        # Obtain the required buffer size.
        buffer_size = acl_mdl_get_size_by_index(self.model_desc, i)
        # Allocate the buffer memory.
        buffer, ret = acl.rt.malloc(buffer_size, ACL_MEM_MALLOC_HUGE_FIRST)
        # Create buffer data from the memory.
        data_buffer = acl.create_data_buffer(buffer, buffer_size)
        # Add the buffer data to the dataset.
        _, ret = acl.mdl.add_dataset_buffer(dataset, data_buffer)
        datas.append({"buffer": buffer, "data": data_buffer, "size": buffer_size})
    return dataset, datas
```
- Implement the synchronous inference method in the net class.
```python
def forward(self, inputs):
    # Run an inference job.
    # Traverse all inputs and copy them to the corresponding buffer memory.
    input_num = len(inputs)
    for i in range(input_num):
        bytes_data = inputs[i].tobytes()
        bytes_ptr = acl.util.bytes_to_ptr(bytes_data)
        # Transfer image data from the host to the device.
        ret = acl.rt.memcpy(self.input_data[i]["buffer"],   # destination address on the device
                            self.input_data[i]["size"],     # size of the destination address
                            bytes_ptr,                      # source address on the host
                            len(bytes_data),                # size of the source address
                            ACL_MEMCPY_HOST_TO_DEVICE)      # mode: from host to device
    # Perform model inference.
    ret = acl.mdl.execute(self.model_id, self.input_dataset, self.output_dataset)
    # Process the model inference output and print the class indexes
    # corresponding to the top 5 confidence values.
    inference_result = []
    for i, item in enumerate(self.output_data):
        buffer_host, ret = acl.rt.malloc_host(self.output_data[i]["size"])
        # Transfer the inference output data from the device to the host.
        ret = acl.rt.memcpy(buffer_host,                    # destination address on the host
                            self.output_data[i]["size"],    # size of the destination address
                            self.output_data[i]["buffer"],  # source address on the device
                            self.output_data[i]["size"],    # size of the source address
                            ACL_MEMCPY_DEVICE_TO_HOST)      # mode: from device to host
        # Obtain the bytes object from the memory address.
        bytes_out = acl.util.ptr_to_bytes(buffer_host, self.output_data[i]["size"])
        # Convert the data into a NumPy array in float32 format.
        data = np.frombuffer(bytes_out, dtype=np.float32)
        inference_result.append(data)
        # Free the host buffer.
        ret = acl.rt.free_host(buffer_host)
    vals = np.array(inference_result).flatten()
    # Apply softmax to the result.
    vals = np.exp(vals)
    vals = vals / np.sum(vals)
    return vals
```
- Implement the destructor method in the net class.
- Destroy dataset resources (buffer data, buffer memory, input dataset, and output dataset).
- Destroy the model description and unload the model.
- Release computing resources.
- After all pyACL APIs are called (or before the process exits), call the acl.finalize API to deinitialize pyACL.
Exceptions may occur during inference. Add the resource release function to the destructor to ensure that resources can be correctly released. The following content is for reference only. In actual situations, you need to consider resource release in more cases.
```python
def __del__(self):
    # The destructor releases resources in the reverse order of initialization.
    # Destroy the input and output data sets.
    for dataset in [self.input_data, self.output_data]:
        while dataset:
            item = dataset.pop()
            ret = acl.destroy_data_buffer(item["data"])  # Destroy the buffer data.
            ret = acl.rt.free(item["buffer"])            # Free the buffer memory.
    ret = acl.mdl.destroy_dataset(self.input_dataset)    # Destroy the input dataset.
    ret = acl.mdl.destroy_dataset(self.output_dataset)   # Destroy the output dataset.
    # Destroy the model description.
    ret = acl.mdl.destroy_desc(self.model_desc)
    # Unload the model.
    ret = acl.mdl.unload(self.model_id)
    # Release the device.
    ret = acl.rt.reset_device(self.device_id)
    # Deinitialize pyACL.
    ret = acl.finalize()
```
- Pre-process the image.
```python
def transfer_pic(input_path):
    # Image pre-processing
    input_path = os.path.abspath(input_path)
    with Image.open(input_path) as image_file:
        # Resize to 224 x 224.
        img = image_file.resize((224, 224))
        # Convert to an ndarray of the float32 type.
        img = np.array(img).astype(np.float32)
        # Normalize pixels using the mean and variance of ImageNet images.
        img -= [123.675, 116.28, 103.53]
        img /= [58.395, 57.12, 57.375]
        # Switch the channel order from RGB to BGR.
        img = img[:, :, ::-1]
        # Move the color channel first for ResNet-50.
        img = img.transpose((2, 0, 1))
        # Add the batch dimension and return.
        return np.array([img])
```
- Call forward (for details, see 5) to perform synchronous inference and print the top 5 class indexes and confidence values on the screen.
```python
def print_top_5(data):
    top_5 = data.argsort()[::-1][:5]
    print("======== top5 inference results: =============")
    for j in top_5:
        print("[%d]: %f" % (j, data[j]))


if __name__ == "__main__":
    resnet50 = net('./model/resnet50.om')
    image_paths = ["./data/dog1_1024_683.jpg", "./data/dog2_1024_683.jpg"]
    for path in image_paths:
        # Image pre-processing. The following is for reference only.
        # You can pre-process images as required.
        image = transfer_pic(path)
        # Transfer a list of data based on the sequence of each input.
        # In this example, the ResNet-50 model has only one input.
        result = resnet50.forward([image])
        # Output the top 5 results.
        print_top_5(result)
    del resnet50
```
App Execution
Upload the first_app folder and its contents to the operating environment, go to the code directory, ensure that the environment variables are correctly configured, and run the following command:
```shell
python3 first_app.py
```
The following output is obtained; it shows the top 5 class indexes and confidence values for the two test images.
[161]: 0.809159 indicates that the confidence of the class index 161 is 0.809159.
```
======== top5 inference results: =============
[161]: 0.809159
[162]: 0.103680
[178]: 0.017600
[166]: 0.013922
[212]: 0.009644
======== top5 inference results: =============
[267]: 0.728299
[266]: 0.101693
[265]: 0.100117
[151]: 0.004214
[160]: 0.002721
```
The mapping between class indexes and classes is related to the dataset used for model training. The model used in this example is trained based on the ImageNet dataset. You can look up the mapping of your dataset on the Internet.
In this example, the mapping is as follows:
"161": ["basset", "basset hound"]
"162": ["beagle"]
"163": ["bloodhound", "sleuthhound"]
"166": ["Walker hound", "Walker foxhound"]
"167": ["English foxhound"]
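As an illustration, the index-to-label lookup can be sketched in Python using the excerpt above. This is a minimal sketch: a full app would load the complete mapping file for its training dataset, and the `label_of` helper is a hypothetical name introduced here for the example.

```python
# Hypothetical lookup table built from the ImageNet mapping excerpt quoted above.
# Indexes not present in the excerpt fall back to "unknown".
idx_to_label = {
    161: ["basset", "basset hound"],
    162: ["beagle"],
    163: ["bloodhound", "sleuthhound"],
    166: ["Walker hound", "Walker foxhound"],
    167: ["English foxhound"],
}

def label_of(index):
    # Return the primary (first) label for a class index, or "unknown".
    names = idx_to_label.get(index)
    return names[0] if names else "unknown"

# Top-5 indexes from the first test image in the output above.
top5 = [161, 162, 178, 166, 212]
print([label_of(i) for i in top5])
# → ['basset', 'beagle', 'unknown', 'Walker hound', 'unknown']
```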