Sample Overview

Obtaining the Sample

Click asynchronous inference to obtain the sample.

Function Usage

This sample shows how to classify images based on the Caffe ResNet-50 network (single input with batch size = 1).

Convert the model file of the Caffe ResNet-50 network into an offline model (.om file) adapted to the Ascend AI Processor. The sample loads the .om file and runs asynchronous inference n times on two .jpg images, where n is a user-configurable app parameter that defaults to 4 and is set with --execute_times. It then processes the n inference results and outputs the class indexes with the top 5 confidence values for each image.
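As a minimal sketch, the --execute_times parameter could be parsed as follows. Only the flag name and its default of 4 come from the description above; the helper name parse_args and the help text are hypothetical and not taken from the sample's source.

```python
import argparse


def parse_args(argv=None):
    """Parse the sample's command-line options (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Asynchronous ResNet-50 inference")
    # Number of asynchronous inference runs (n); 4 is the documented default.
    parser.add_argument("--execute_times", type=int, default=4,
                        help="number of asynchronous inference runs")
    return parser.parse_args(argv)


# Passing an explicit list avoids reading sys.argv, which keeps the sketch testable.
args = parse_args(["--execute_times", "8"])
print("execute_times =", args.execute_times)
```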

Main APIs

The main APIs are listed below, grouped by function.

A device is the neural-network processing unit (NPU) in the board environment. An SoC has only one device.

Initialization

  • Call acl.init to initialize the configuration.
  • Call acl.finalize to deinitialize the configuration.

Device management

  • Call acl.rt.set_device to specify the compute device.
  • Call acl.rt.get_run_mode to obtain the running mode of the software stack. The internal processing flow varies according to the running mode.
  • Call acl.rt.reset_device to reset the current device and reclaim the associated resources.

Stream management

  • Call acl.rt.create_stream to create a stream.
  • Call acl.rt.destroy_stream to destroy a stream.
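The initialization, device, and stream calls listed above come in acquire/release pairs, and the release calls are normally issued in reverse order. The following sketch wraps that pairing in a context manager. It assumes that pyACL calls return 0 on success and that acl.rt.create_stream returns a (stream, ret) pair; the helper names check and acl_session are hypothetical. The acl module is passed in as an argument so the sketch can be read and exercised without an Ascend environment.

```python
from contextlib import contextmanager

ACL_SUCCESS = 0  # assumption: pyACL reports success with return code 0


def check(ret, what):
    """Raise if an ACL call reported a non-zero return code."""
    if ret != ACL_SUCCESS:
        raise RuntimeError(f"{what} failed, ret={ret}")


@contextmanager
def acl_session(acl, device_id=0):
    """Acquire ACL, a device, and a stream; release them in reverse order."""
    check(acl.init(), "acl.init")
    try:
        check(acl.rt.set_device(device_id), "acl.rt.set_device")
        try:
            stream, ret = acl.rt.create_stream()
            check(ret, "acl.rt.create_stream")
            try:
                yield stream
            finally:
                acl.rt.destroy_stream(stream)
        finally:
            acl.rt.reset_device(device_id)
    finally:
        acl.finalize()


# Usage on a machine with pyACL installed (not executed here):
#   import acl
#   with acl_session(acl, device_id=0) as stream:
#       ...  # load the model and run inference on this stream
```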

Memory management

  • Call acl.rt.malloc_host to allocate memory on the host.
  • Call acl.rt.free_host to deallocate the memory on the host.
  • Call acl.rt.malloc to allocate device memory.
  • Call acl.rt.free to deallocate device memory.

Data transfer

If your app runs on the host, call acl.rt.memcpy to:

  • Transfer the decode source data from the host to the device.
  • Transfer the inference result from the device to the host.

Data transfer is not required if your app runs in the board environment.
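The host-to-device case can be sketched as below, combining acl.rt.get_run_mode, acl.rt.malloc, and acl.rt.memcpy. The numeric constant values, the (value, ret) return pairs, and the argument order of acl.rt.memcpy (destination, destination size, source, byte count, direction) are assumptions to verify against the pyACL reference; to_device is a hypothetical helper name.

```python
ACL_HOST = 1                   # assumed run-mode value: the app runs on the host
ACL_MEM_MALLOC_HUGE_FIRST = 0  # assumed allocation-policy value
ACL_MEMCPY_HOST_TO_DEVICE = 1  # assumed copy-direction value


def to_device(acl, host_ptr, size):
    """Return (pointer usable as model input, True if device memory was allocated)."""
    run_mode, ret = acl.rt.get_run_mode()
    if ret != 0:
        raise RuntimeError(f"acl.rt.get_run_mode failed, ret={ret}")
    if run_mode != ACL_HOST:
        # Board environment: no transfer is required, so reuse the buffer.
        return host_ptr, False

    dev_ptr, ret = acl.rt.malloc(size, ACL_MEM_MALLOC_HUGE_FIRST)
    if ret != 0:
        raise RuntimeError(f"acl.rt.malloc failed, ret={ret}")
    ret = acl.rt.memcpy(dev_ptr, size, host_ptr, size, ACL_MEMCPY_HOST_TO_DEVICE)
    if ret != 0:
        acl.rt.free(dev_ptr)  # do not leak the device buffer on failure
        raise RuntimeError(f"acl.rt.memcpy failed, ret={ret}")
    return dev_ptr, True
```

The boolean in the return value tells the caller whether it owns a device buffer that must later be released with acl.rt.free.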

Model inference

  • Call acl.mdl.load_from_file_with_mem to load a model from an .om file.
  • Create a thread (for example, t1) whose thread function calls acl.rt.process_report, which triggers the callback function (for example, CallBackFunc) after a specified period of time.
  • Call acl.rt.subscribe_report to register thread t1 as the thread that handles the callback function CallBackFunc in the stream.
  • Call acl.mdl.execute_async (asynchronous API) to run model inference.
  • Call acl.rt.launch_callback to add a callback function (CallBackFunc), to be executed on the host or device, to the task queue of the stream.
  • Call acl.rt.synchronize_stream to block the app until all tasks in the specified stream are complete.
  • Call acl.rt.unsubscribe_report to cancel the thread registration. The callback function (CallBackFunc) of the stream is then no longer processed by the specified thread (t1).
  • Call acl.mdl.unload to unload the model after the model inference is complete.
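The ordering of the asynchronous calls above can be sketched as follows. This is an illustration under stated assumptions, not the sample's actual code: acl.util.start_thread, acl.util.stop_thread, and acl.rt.set_context are not mentioned in this document and are assumed here from typical pyACL usage, as are the argument orders, the (thread_id, ret) return pair, and the block-type value 1 passed to acl.rt.launch_callback. The names run_async, report_loop, and timeout_ms are hypothetical.

```python
def run_async(acl, model_id, context, stream, inputs, outputs, callback,
              timeout_ms=50):
    """Queue one asynchronous inference per dataset pair, then wait for the stream."""
    state = {"running": True}

    def _ok(ret, what):
        if ret != 0:  # assumption: 0 means success
            raise RuntimeError(f"{what} failed, ret={ret}")

    def report_loop(args):
        # Thread function of t1: keep processing callback reports until stopped.
        ctx, timeout = args
        acl.rt.set_context(ctx)
        while state["running"]:
            acl.rt.process_report(timeout)

    tid, ret = acl.util.start_thread(report_loop, [context, timeout_ms])
    _ok(ret, "acl.util.start_thread")
    _ok(acl.rt.subscribe_report(tid, stream), "acl.rt.subscribe_report")
    try:
        for model_input, model_output in zip(inputs, outputs):
            _ok(acl.mdl.execute_async(model_id, model_input, model_output, stream),
                "acl.mdl.execute_async")
        # Queue the callback behind the inference tasks; 1 = blocking (assumed).
        _ok(acl.rt.launch_callback(callback, outputs, 1, stream),
            "acl.rt.launch_callback")
        _ok(acl.rt.synchronize_stream(stream), "acl.rt.synchronize_stream")
    finally:
        acl.rt.unsubscribe_report(tid, stream)
        state["running"] = False  # lets report_loop exit
        acl.util.stop_thread(tid)
```

Because every task is queued on the same stream, the callback runs only after the inference tasks queued before it, and acl.rt.synchronize_stream returns once the whole queue has drained.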

Data postprocessing

The sample processes the model inference result and prints the class indexes with the top 1 confidence values on the terminal.
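Selecting the highest-confidence classes from an inference result can be sketched in plain Python as below. The helper name top_k and the sample scores are made up for illustration, and k is left as a parameter; the real sample reads its confidences from the model output buffer.

```python
def top_k(confidences, k=5):
    """Return [(class_index, confidence)] for the k highest confidences."""
    ranked = sorted(range(len(confidences)),
                    key=lambda i: confidences[i], reverse=True)
    return [(i, confidences[i]) for i in ranked[:k]]


scores = [0.05, 0.70, 0.10, 0.15]  # made-up confidences for four classes
for index, value in top_k(scores, k=2):
    print(f"class index {index}: confidence {value:.2f}")
```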

Directory Structure

The directory structure is as follows:

resnet50_async_imagenet_classification
├── scripts
│   ├── host_version.conf // Version number configuration file
│   └── testcase_300.sh // Run script
├── src
│   ├── acl_net.py // Running file
│   └── constant.py // Constant definition
├── data
│   ├── fusion_result.json // File generated after ATC conversion, which records the fused operator information
│   ├── dog1_1024_683.jpg // Test image data
│   └── dog2_1024_683.jpg // Test image data
├── caffe_model
│   ├── resnet50.caffemodel // ResNet-50 model
│   └── resnet50.prototxt // ResNet-50 network file
├── model // Directory generated after ATC conversion
│   └── resnet50.om // Model file generated after conversion
└── README_CN.md