Sample Overview

Obtaining the Sample

To obtain the sample, click the "video decoding+synchronous inference" link.

Function Description

This sample shows how to classify images based on the Caffe ResNet-50 network (single input with batch size = 1).

Convert the model file of the Caffe ResNet-50 network into an offline model (.om file) adapted to the Ascend AI Processor. The sample loads the .om file, decodes an H.265 video stream (containing only one frame) 10 times in a loop to obtain 10 YUV420SP NV12 images, resizes the 10 images, and performs inference on the resized images. It then processes the inference results and outputs the class indexes with the top confidence values and the sum of the top 5 confidence values.
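The result-processing step can be sketched as follows. This is a minimal illustration, not the sample's own code: the function name summarize_top5 and the 8-class score list are hypothetical, and the real model output would first have to be copied into a Python sequence of confidence values.

```python
def summarize_top5(confidences):
    """Return the top-5 (class index, confidence) pairs and the sum of those confidences."""
    # Pair every confidence with its class index, then sort by confidence, highest first.
    ranked = sorted(enumerate(confidences), key=lambda pair: pair[1], reverse=True)
    top5 = ranked[:5]
    return top5, sum(value for _, value in top5)


# Hypothetical confidence values for an 8-class output (illustration only).
scores = [0.01, 0.50, 0.02, 0.20, 0.05, 0.10, 0.04, 0.08]
top5, total = summarize_top5(scores)
for index, value in top5:
    print(f"class index {index}: confidence {value:.2f}")
print(f"sum of top 5 confidences: {total:.2f}")
```

Sorting the full list is simple and adequate for a 1000-class output; a partial selection would only matter for much larger outputs.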

During model conversion, set the color space conversion (CSC) parameters so that YUV420SP images are converted into RGB images, which is the input format the model requires.
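To make the conversion concrete, the sketch below converts a single YUV pixel to RGB in plain Python. It assumes the widely used full-range BT.601 coefficients. The coefficients actually applied on the device come from the CSC parameters in the model configuration and may differ, so treat the numbers here as an illustration only. The function name yuv_to_rgb is hypothetical.

```python
def yuv_to_rgb(y, u, v):
    """Convert one 8-bit YUV pixel to 8-bit RGB (assumed full-range BT.601 coefficients)."""

    def clamp(value):
        # Round to the nearest integer and keep the result inside the 8-bit range.
        return max(0, min(255, int(round(value))))

    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp(r), clamp(g), clamp(b)


# With neutral chroma (U = V = 128), the result is a gray level equal to Y.
print(yuv_to_rgb(128, 128, 128))
```

In a YUV420SP NV12 image, each U/V pair is shared by a 2x2 block of Y samples, so a full-frame conversion would apply this per-pixel formula with the chroma values looked up at half resolution.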

Main APIs

The following table lists the main APIs.

Function	Description
Initialization	Call acl.init to initialize the configuration. Call acl.finalize to deinitialize the configuration.
Device management	Call acl.rt.set_device to specify the compute device. Call acl.rt.get_run_mode to obtain the running mode of the software stack; the internal processing flow varies according to the running mode. Call acl.rt.reset_device to reset the current device and reclaim the associated resources.
Stream management	Call acl.rt.create_stream to create a stream. Call acl.rt.destroy_stream to destroy a stream. Call acl.rt.synchronize_stream to block the program until all tasks in the specified stream are complete.
Memory management	Call acl.rt.malloc_host to allocate memory on the host. Call acl.rt.free_host to deallocate the memory on the host. Call acl.rt.malloc to allocate device memory. Call acl.rt.free to deallocate device memory. If device memory is needed to store the input or output data of media data processing, call acl.media.dvpp_malloc to allocate the memory and acl.media.dvpp_free to deallocate it.
Data transfer	If your app runs on the host, call acl.rt.memcpy to transfer the source data to be decoded from the host to the device, and to transfer the inference result from the device to the host. Data transfer is not required if your app runs in the board environment.
Model inference	Call acl.mdl.load_from_file_with_mem to load a model from an .om file. Call acl.mdl.execute to perform model inference; before inference, the CSC parameters in the .om file are used to convert the YUV420SP image into an RGB image. Call acl.mdl.unload to unload the model.
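The order in which the initialization, device, and stream APIs are typically paired can be sketched as follows. This is a hedged outline rather than the sample's implementation. It assumes the usual pyACL return convention (an error code where 0 means success, and a value plus error code for calls that create or query something). The helper names check and run_with_acl are hypothetical. The acl module is passed in as an argument so that the sketch does not depend on the Ascend toolkit being installed.

```python
def check(ret, name):
    """Raise if a pyACL-style call reports a nonzero error code (assumed convention)."""
    if ret != 0:
        raise RuntimeError(f"{name} failed, ret={ret}")


def run_with_acl(acl, device_id, work):
    """Set up pyACL resources, run work(stream, run_mode), then release in reverse order."""
    check(acl.init(), "acl.init")
    try:
        check(acl.rt.set_device(device_id), "acl.rt.set_device")
        try:
            stream, ret = acl.rt.create_stream()
            check(ret, "acl.rt.create_stream")
            try:
                # The run mode tells the caller whether host/device copies are needed.
                run_mode, ret = acl.rt.get_run_mode()
                check(ret, "acl.rt.get_run_mode")
                result = work(stream, run_mode)
                # Block until every task queued on the stream has finished.
                check(acl.rt.synchronize_stream(stream), "acl.rt.synchronize_stream")
                return result
            finally:
                acl.rt.destroy_stream(stream)
        finally:
            acl.rt.reset_device(device_id)
    finally:
        acl.finalize()
```

On a machine where the pyACL package is available, this could be called as run_with_acl(acl, 0, work) after import acl, where work is a user-supplied function that performs decoding, resizing, and inference. The nested try/finally blocks ensure that the stream, the device, and the pyACL configuration are released even if a step fails.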

Directory Structure

The directory structure is as follows:

vdec_resnet50_classification
├── scripts
│ ├── host_version.conf // Version number configuration file
│ └── testcase_300.sh // Run script
├── src
│ ├── acl_dvpp.py // Image resizing implementation file
│ ├── acl_model.py // Model inference implementation file
│ ├── acl_sample.py // Running file
│ ├── acl_util.py // Implementation file of tool functions
│ ├── acl_vdec.py // Video decoding implementation file
│ └── constant.py // Constant definition
├── data
│ └── vdec_h265_1frame_rabbit_1280x720.h265 // Video file to be processed, which must be obtained by the user
├── README_CN.md
├── caffe_model // Model deployed by the user
│ ├── aipp.cfg // Model configuration data
│ ├── resnet50.caffemodel // ResNet-50 model file
│ └── resnet50.prototxt // ResNet-50 network file
└── model // Directory generated after the model is converted
  └── resnet50_aipp.om // Model file generated after conversion
