Sample Overview

Obtaining the Sample

Click asynchronous inference to obtain the sample.

Function Usage

This sample shows how to classify images based on the Caffe ResNet-50 network (single input with batch size = 1).

First, convert the model file of the Caffe ResNet-50 network into an offline model (.om file) adapted to the Ascend AI Processor. The sample then loads the .om file and runs asynchronous inference n times on two .jpg images, where n is a user-configurable app parameter that defaults to 4 and is set with the --execute_times parameter. Finally, the sample processes the n inference results and outputs the class indexes with the top 5 confidence values for each image.
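The two app-level pieces described above, the --execute_times parameter and the top 5 selection, can be sketched in plain Python. This is a minimal illustration rather than the code in acl_net.py: the function names parse_args and top_k_indexes are hypothetical, and the confidence values are assumed to arrive as a flat sequence of floats with one entry per class.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line options used in this sketch."""
    parser = argparse.ArgumentParser(description="Asynchronous inference sketch")
    # --execute_times is the option named in this section; its default is 4.
    parser.add_argument("--execute_times", type=int, default=4,
                        help="number of asynchronous inference runs (n)")
    return parser.parse_args(argv)


def top_k_indexes(confidences, k=5):
    """Return the k class indexes with the highest confidence, best first."""
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    return order[:k]
```

For example, parse_args(["--execute_times", "8"]).execute_times yields 8, and top_k_indexes would be called once per inference result with that result's confidence vector.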

Main APIs

The following table lists the main APIs.

A device refers to the neural-network processing unit (NPU) in the board environment. An SoC has only one device.

Initialization	Call acl.init to initialize the configuration. Call acl.finalize to deinitialize the configuration.
Device management	Call acl.rt.set_device to specify the compute device. Call acl.rt.get_run_mode to obtain the run mode of the software stack; the internal processing varies with the run mode. Call acl.rt.reset_device to reset the current device and reclaim the associated resources.
Stream management	Call acl.rt.create_stream to create a stream. Call acl.rt.destroy_stream to destroy a stream.
Memory management	Call acl.rt.malloc_host to allocate host memory. Call acl.rt.free_host to free host memory. Call acl.rt.malloc to allocate device memory. Call acl.rt.free to free device memory.
Data transfer	If your app runs on the host, call acl.rt.memcpy to transfer the decoded source data from the host to the device and to transfer the inference result from the device to the host. Data transfer is not required if your app runs in the board environment.
Model inference	Call acl.mdl.load_from_file_with_mem to load a model from an .om file. Create a thread (for example, t1) and call acl.rt.process_report in the thread function so that the callback function (for example, CallBackFunc) is triggered after a specified period of time. Call acl.rt.subscribe_report to designate thread t1 to handle the callback function CallBackFunc in the stream. Call acl.mdl.execute_async (an asynchronous API) to run model inference. Call acl.rt.launch_callback to add the callback function CallBackFunc, to be executed on the host or device, to the task queue of the stream. Call acl.rt.synchronize_stream to block the app until all tasks in the specified stream are complete. Call acl.rt.unsubscribe_report to cancel the thread registration, after which the callback function CallBackFunc in the stream is no longer handled by thread t1. Call acl.mdl.unload to unload the model after model inference is complete.
Data postprocessing	The sample processes the model inference result and prints the class indexes with the top 5 confidence values on the terminal.
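The call order in the model inference row can be sketched as follows. This is a hedged outline under several assumptions, not the sample's implementation. The acl module is passed in as a parameter so the sketch can be read without the runtime installed. acl.util.start_thread, acl.util.stop_thread, and acl.rt.set_context are not listed in the table above and are assumed here for creating and stopping thread t1. Status codes are assumed to be 0 on success, start_thread is assumed to return a (thread ID, status) pair, and the 50 ms timeout and the block type value 1 are illustrative. The function name run_async_inference is hypothetical.

```python
def run_async_inference(acl, model_id, input_dataset, output_dataset,
                        stream, context, callback, execute_times=4,
                        timeout_ms=50):
    """Queue execute_times asynchronous inferences on stream and wait for them."""
    state = {"running": True}

    def report_loop(args_list):
        # Body of thread t1: bind the context, then keep polling so that
        # callbacks queued on the stream are triggered.
        ctx, timeout = args_list
        acl.rt.set_context(ctx)
        while state["running"]:
            acl.rt.process_report(timeout)

    def check(ret, what):
        # Assumption: a status code of 0 means success.
        if ret != 0:
            raise RuntimeError(f"{what} failed, ret={ret}")

    # Assumption: start_thread returns (thread_id, ret).
    tid, ret = acl.util.start_thread(report_loop, [context, timeout_ms])
    check(ret, "start_thread")
    # Designate thread t1 to handle callbacks launched on this stream.
    check(acl.rt.subscribe_report(tid, stream), "subscribe_report")
    try:
        for _ in range(execute_times):
            check(acl.mdl.execute_async(model_id, input_dataset,
                                        output_dataset, stream),
                  "execute_async")
            # Assumption: block type 1 makes later stream tasks wait
            # until the callback has finished.
            check(acl.rt.launch_callback(callback, output_dataset, 1, stream),
                  "launch_callback")
        # Block until every task queued on the stream is complete.
        check(acl.rt.synchronize_stream(stream), "synchronize_stream")
    finally:
        state["running"] = False
        acl.rt.unsubscribe_report(tid, stream)
        acl.util.stop_thread(tid)
```

Queuing the callback right after each execute_async call means each result is handled in stream order, and a single synchronize_stream call at the end waits for all n runs at once.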

Directory Structure

The directory structure is as follows:

resnet50_async_imagenet_classification
├── scripts
│   ├── host_version.conf // Version number configuration file
│   └── testcase_300.sh // Run script
├── src
│   ├── acl_net.py // Running file
│   └── constant.py // Constant definition
├── data
│   ├── fusion_result.json // File generated after ATC conversion, which records the fused operator information
│   ├── dog1_1024_683.jpg // Test image data
│   └── dog2_1024_683.jpg // Test image data
├── caffe_model
│   ├── resnet50.caffemodel // ResNet-50 model
│   └── resnet50.prototxt // ResNet-50 network file
├── model // Directory generated after ATC conversion
│   └── resnet50.om // Model file generated after conversion
└── README_CN.md
