Getting Started

This section describes the basic workflow of developing an image classification app by using AscendCL APIs (C++), as well as the fundamental concepts.

Image Classification App

An image classification app, as the name implies, is used to classify images into different classes.

Figure 1 Image classification app

To build an image classification app, you need to prepare an image classification model. You can use an open-source, pre-trained model, and optimize and retrain it if necessary. Alternatively, you can build your own model from scratch.

As this is an introductory tutorial, we will directly obtain a pre-trained open-source model. This method is relatively simple. Here, we use the ONNX ResNet-50 model.

About the ResNet-50 model:

  • Input data: a BGR input image with a size of 224 × 224
  • Output data: image class indexes and confidence values. A confidence value indicates the probability that the image belongs to a certain class.
  • You can view the mapping between class indexes and class names in the dataset used for model training.
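For example, once inference returns a class index, a lookup table built from the training dataset's label file turns the index into a readable name. The table and function below are an illustrative sketch with a few ImageNet-style entries, not part of the sample:

```cpp
#include <map>
#include <string>

// Illustrative index-to-label table built from the training dataset's label
// file; the entries below are ImageNet-style examples, not the full mapping.
const std::map<unsigned int, std::string> kClassLabels = {
    {0, "tench"},
    {1, "goldfish"},
    {2, "great white shark"},
};

// Return the label for a class index, or a placeholder if it is unknown.
std::string LabelOf(unsigned int classIndex) {
    auto it = kClassLabels.find(classIndex);
    return it != kClassLabels.end() ? it->second : "unknown";
}
```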

Fundamental Concepts

  • Host

    A host refers to the x86 or Arm server connected to the device. The host utilizes the neural network (NN) compute capabilities provided by the device to implement services.

  • Device

    A device refers to a hardware device powered by the Ascend AI Processor. It connects to the host over the PCIe interface, providing the NN compute capability.

  • Development environment and operating environment

    The development environment refers to the environment for code building and app development. The operating environment refers to the environment where operator, inference, or training programs run. The operating environment must be equipped with the Ascend AI Processor.

    The development environment and operating environment can be deployed on the same server (co-deployment) or different servers (separate deployment).

    • Co-deployment: Log in to the server directly. No switchover between the development environment and the operating environment is required.
    • Separate deployment: If different operating system (OS) architectures are used, cross compilation is required in the development environment so that the executable file can be executed in the operating environment.

    To check the OS architecture of an environment, you can log in to it and run the uname -a command.

  • Running user

    The user who runs inference or training.

Development Workflow

AscendCL provides a collection of C language APIs for development of DNN inference apps on Compute Architecture for Neural Networks (CANN). These APIs are designed for model and operator loading and execution, as well as media data processing, facilitating deep learning inference computing, graphics and image preprocessing, and single-operator accelerated computing on the Ascend CANN platform.

Figure 2 Development workflow

The following part elaborates on the key functions involved in app development, the AscendCL APIs called by each function, and how the AscendCL APIs are connected.

Although you may not understand all the details now, you can begin by learning about the overall code logic. Then, you can learn the details by building and running your apps and analyzing the sample code.

App Build and Run

Obtain and download the quick start sample, prepare the model and test images, and build and run your app by referring to README.md.

Sample Code Analysis

If you want to learn about the code implementation logic after building and running the quick start sample, see the following code analysis. This section describes the code logic function by function. To facilitate code reading and understanding, the variables related to each function are described together with it.

Open resnet50_firstapp/src/main.cpp and start from the main function to understand how the code of the entire sample is connected, and then explore the implementation of the custom functions.

  1. Connect the code logic of the entire app in the main function.
    int main()
    {
        // 1. Initialize AscendCL and allocate runtime resources (specify a compute device).
        InitResource();

        // 2. Load the image classification model for inference. If the .om file is not in the model directory, change the path to the actual directory.
        const char *modelPath = "../model/resnet50.om";
        LoadModel(modelPath);

        // 3. Read the image data into a host buffer and transfer it to the device for inference.
        const char *picturePath = "../data/dog1_1024_683.bin";
        LoadPicture(picturePath);

        // 4. Perform model inference.
        Inference();

        // 5. Process the inference result to print the class indexes of the top 5 confidence values of the test image.
        PrintResult();

        // 6. Unload the image classification model.
        UnloadModel();

        // 7. Free the memory and destroy inference-related data to prevent memory leaks.
        UnloadPicture();

        // 8. Deinitialize AscendCL and destroy runtime resources (reset the compute device).
        DestroyResource();
    }
    

    After examining the overall code logic, we now explore the implementation of the custom functions.

  2. Include dependent header files, including those of AscendCL and C or C++ standard libraries.
    #include "acl/acl.h"
    #include <iostream>
    #include <fstream>
    #include <cstring>
    #include <map>
    #include <cmath>
    
    using namespace std;
    
  3. Initialize resources.

    Resource initialization includes the following two processes:

    • Calling aclInit to initialize AscendCL

      If AscendCL is not initialized, errors may occur when internal system resources are being initialized, causing exceptions in other services.

      You can pass inference-related configurations (for example, profiling configurations) to the AscendCL initialization API in JSON format. If the default configuration (where profiling is disabled) already meets the requirements, pass nullptr instead.

    • Calling aclrtSetDevice to specify the compute device
    int32_t deviceId = 0;
    void InitResource()
    {
        aclError ret = aclInit(nullptr);
        ret = aclrtSetDevice(deviceId);
    }
    

    AscendCL needs to be deinitialized once it is initialized. After all AscendCL APIs are invoked or before the application exits, resources need to be deinitialized. For details, see Step 10.

  4. Load the model.
    An .om model file is loaded here. To convert the ONNX ResNet-50 model file into an .om model file, see "Preparing a Model" in the sample's README.md.
    uint32_t modelId;
    void LoadModel(const char* modelPath)
    {
        aclError ret = aclmdlLoadFromFile(modelPath, &modelId);
    }
    

    After model inference is complete, unload the model. For details, see Step 8.

  5. Read the test image to the buffer and transfer it to the device for inference.
    size_t pictureDataSize = 0;
    void *pictureHostData;
    void *pictureDeviceData;

    // Allocate host memory and read the test image into it by using the C++ standard library.
    void ReadPictureToHost(const char *picturePath)
    {
        string fileName = picturePath;
        ifstream binFile(fileName, ifstream::binary);
        binFile.seekg(0, binFile.end);
        pictureDataSize = binFile.tellg();
        binFile.seekg(0, binFile.beg);
        aclError ret = aclrtMallocHost(&pictureHostData, pictureDataSize);
        binFile.read((char*)pictureHostData, pictureDataSize);
        binFile.close();
    }

    // Allocate device memory and transfer the image data from host memory to the device through a memory copy.
    void CopyDataFromHostToDevice()
    {
        aclError ret = aclrtMalloc(&pictureDeviceData, pictureDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
        ret = aclrtMemcpy(pictureDeviceData, pictureDataSize, pictureHostData, pictureDataSize, ACL_MEMCPY_HOST_TO_DEVICE);
    }

    void LoadPicture(const char* picturePath)
    {
        ReadPictureToHost(picturePath);
        CopyDataFromHostToDevice();
    }
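
The file-reading half of LoadPicture can be tried on any machine, without AscendCL. The sketch below (function name illustrative, not part of the sample) reproduces the same seekg/tellg/read pattern, using a std::vector as the host buffer in place of aclrtMallocHost:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Read a whole binary file into a byte vector, mirroring the
// seekg/tellg/read pattern used by ReadPictureToHost.
std::vector<char> ReadBinaryFile(const std::string &path) {
    std::ifstream binFile(path, std::ifstream::binary);
    if (!binFile) {
        return {};                        // file missing or unreadable
    }
    binFile.seekg(0, binFile.end);        // jump to the end to measure the size
    std::streamsize size = binFile.tellg();
    binFile.seekg(0, binFile.beg);        // rewind to the beginning
    std::vector<char> buffer(static_cast<size_t>(size));
    binFile.read(buffer.data(), size);    // read the whole file in one call
    return buffer;
}
```

The sample allocates the host buffer with aclrtMallocHost instead so that the subsequent aclrtMemcpy to the device is efficient; a plain std::vector is enough to verify the file-size logic.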
    
  6. Perform inference.

    The input and output data for inference must be stored based on the data types specified by AscendCL. The related data types are as follows:

    • aclmdlDesc data describes the basic information of your model (such as the input/output count, buffer size, and the name, data type, format, and shape of each input or output).

      After the model is successfully loaded, you can call the API of this data type based on the model ID to obtain information such as the model input and output count, buffer size, dimensions, format, and data types from the model description.

    • aclDataBuffer data describes the buffer address and size of each input/output.

      You can call the API of this data type to obtain the buffer address and size in order to store the input data in the buffer and obtain the output data.

    • aclmdlDataset data describes the input and output data of a model.

      You can call the API of this data type to add multiple pieces of data of the aclDataBuffer type when the model has multiple inputs and outputs.

      Figure 3 Relationship between aclmdlDataset and aclDataBuffer
    aclmdlDataset *inputDataSet;
    aclDataBuffer *inputDataBuffer;
    aclmdlDataset *outputDataSet;
    aclDataBuffer *outputDataBuffer;
    aclmdlDesc *modelDesc;
    size_t outputDataSize = 0;
    void *outputDeviceData;

    // Prepare the input data structure for model inference.
    void CreateModelInput()
    {
        // Create data of the aclmdlDataset type to describe the input for model inference.
        inputDataSet = aclmdlCreateDataset();
        inputDataBuffer = aclCreateDataBuffer(pictureDeviceData, pictureDataSize);
        aclError ret = aclmdlAddDatasetBuffer(inputDataSet, inputDataBuffer);
    }

    // Prepare the output data structure for model inference.
    void CreateModelOutput()
    {
        // Create the model description.
        modelDesc = aclmdlCreateDesc();
        aclError ret = aclmdlGetDesc(modelDesc, modelId);
        // Create data of the aclmdlDataset type to describe the output for model inference.
        outputDataSet = aclmdlCreateDataset();
        // Obtain the buffer size (in bytes) required by the model output data.
        outputDataSize = aclmdlGetOutputSizeByIndex(modelDesc, 0);
        // Allocate the output buffer.
        ret = aclrtMalloc(&outputDeviceData, outputDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
        outputDataBuffer = aclCreateDataBuffer(outputDeviceData, outputDataSize);
        ret = aclmdlAddDatasetBuffer(outputDataSet, outputDataBuffer);
    }

    // Execute the model.
    void Inference()
    {
        CreateModelInput();
        CreateModelOutput();
        aclError ret = aclmdlExecute(modelId, inputDataSet, outputDataSet);
    }
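
The aclmdlDataset/aclDataBuffer relationship shown in Figure 3 can be pictured with standard containers: a dataset is an ordered list of buffers, and each buffer is just an address plus a size. The types below are illustrative stand-ins, not the real AscendCL structures:

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-in for aclDataBuffer: one data address and its size.
struct DataBufferSketch {
    void  *data;
    size_t size;
};

// Illustrative stand-in for aclmdlDataset: an ordered list of buffers,
// one per model input (or output), as in Figure 3.
struct DatasetSketch {
    std::vector<DataBufferSketch> buffers;
    void AddBuffer(void *data, size_t size) { buffers.push_back({data, size}); }
};
```

This is why a model with several inputs calls aclmdlAddDatasetBuffer once per input: each call appends one buffer to the dataset.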
    
  7. Process the model inference result to print the class indexes of the top 5 confidence values of the image.
    void *outputHostData;

    void PrintResult()
    {
        // Copy the inference result data from the device to the host.
        aclError ret = aclrtMallocHost(&outputHostData, outputDataSize);
        ret = aclrtMemcpy(outputHostData, outputDataSize, outputDeviceData, outputDataSize, ACL_MEMCPY_DEVICE_TO_HOST);
        // Cast the buffered data to the float type.
        float *outFloatData = reinterpret_cast<float *>(outputHostData);

        // Sort the class indexes by raw score in descending order.
        map<float, unsigned int, greater<float>> resultMap;
        for (unsigned int j = 0; j < outputDataSize / sizeof(float); ++j) {
            resultMap[*outFloatData] = j;
            outFloatData++;
        }

        // Apply softmax to the raw scores and print the top 5 classes.
        double totalValue = 0.0;
        for (auto it = resultMap.begin(); it != resultMap.end(); ++it) {
            totalValue += exp(it->first);
        }

        int cnt = 0;
        for (auto it = resultMap.begin(); it != resultMap.end(); ++it) {
            if (++cnt > 5) {
                break;
            }
            printf("top %d: index[%u] value[%lf]\n", cnt, it->second, exp(it->first) / totalValue);
        }
    }
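
The post-processing in PrintResult can be exercised on its own, without a device. The sketch below (function name and return type illustrative, not part of the sample) applies the same descending map sort and softmax normalization to an arbitrary score array:

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Return the top-k (class index, softmax probability) pairs for a raw score
// array, using the same map<float, ..., greater<float>> trick as PrintResult.
std::vector<std::pair<unsigned int, double>> TopKSoftmax(
        const std::vector<float> &scores, size_t k) {
    // Sort class indexes by raw score in descending order.
    std::map<float, unsigned int, std::greater<float>> resultMap;
    for (unsigned int j = 0; j < scores.size(); ++j) {
        resultMap[scores[j]] = j;
    }
    // Softmax denominator over all classes.
    double totalValue = 0.0;
    for (const auto &entry : resultMap) {
        totalValue += std::exp(entry.first);
    }
    // Collect the k most confident classes.
    std::vector<std::pair<unsigned int, double>> topK;
    for (const auto &entry : resultMap) {
        if (topK.size() >= k) {
            break;
        }
        topK.push_back({entry.second, std::exp(entry.first) / totalValue});
    }
    return topK;
}
```

Note that, as in the sample, classes with identical raw scores collapse into a single map entry; production post-processing would sort (score, index) pairs instead.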
    
  8. Unload the model and destroy the model description.
    You should perform this step once the inference is complete.
    void UnloadModel()
    {
        // Destroy the model description.
        aclmdlDestroyDesc(modelDesc);
        // Unload the model.
        aclmdlUnload(modelId);
    }
    
  9. Free memory and destroy inference-related data.
    void UnloadPicture()
    {
        // Free input-related resources.
        aclError ret = aclrtFreeHost(pictureHostData);
        pictureHostData = nullptr;
        ret = aclrtFree(pictureDeviceData);
        pictureDeviceData = nullptr;
        aclDestroyDataBuffer(inputDataBuffer);
        inputDataBuffer = nullptr;
        aclmdlDestroyDataset(inputDataSet);
        inputDataSet = nullptr;

        // Free output-related resources.
        ret = aclrtFreeHost(outputHostData);
        outputHostData = nullptr;
        ret = aclrtFree(outputDeviceData);
        outputDeviceData = nullptr;
        aclDestroyDataBuffer(outputDataBuffer);
        outputDataBuffer = nullptr;
        aclmdlDestroyDataset(outputDataSet);
        outputDataSet = nullptr;
    }
    
  10. Deinitialize resources.
    Reset the compute device and deinitialize AscendCL when all AscendCL API calls are complete or before the process exits.
    void DestroyResource()
    {
        aclError ret = aclrtResetDevice(deviceId);
        aclFinalize();
    }