Image Classification with ResNet-50 (Video Decoding + Synchronous Inference)

Sample Obtaining

Refer to vdec_resnet50_classification to obtain the sample.

Description

This sample shows how to classify images based on the Caffe ResNet-50 network (single input with batch size = 1).

Convert the model file of the Caffe ResNet-50 network to an .om offline model adapted to the Ascend AI Processor. The sample loads the .om file, decodes a single-frame H.265 video stream 10 times to obtain 10 YUV420SP (NV12) images, resizes the 10 images, and runs inference on them. It then processes the inference results and outputs the indexes of the classes with the highest confidence values, together with the sum of the top 5 confidence values.
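The post-processing step described above can be sketched in plain C++. This is a minimal illustration, not the sample's own code: the names TopKResult and TopK are hypothetical, and the sketch assumes the inference output has already been copied into a host-side vector holding one confidence value per class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Result of post-processing one inference output (hypothetical helper type).
struct TopKResult {
    std::vector<std::pair<std::size_t, float>> entries;  // (class index, confidence), highest first
    float confidenceSum = 0.0f;                           // sum of the k selected confidences
};

// Returns the k highest-confidence classes and the sum of their confidences.
// Assumption: `scores` holds one confidence value per class.
TopKResult TopK(const std::vector<float>& scores, std::size_t k) {
    k = std::min(k, scores.size());
    std::vector<std::size_t> order(scores.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    // Only the first k positions need to be in sorted order.
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(k),
                      order.end(),
                      [&scores](std::size_t a, std::size_t b) { return scores[a] > scores[b]; });
    TopKResult result;
    for (std::size_t i = 0; i < k; ++i) {
        result.entries.emplace_back(order[i], scores[order[i]]);
        result.confidenceSum += scores[order[i]];
    }
    return result;
}
```

Calling TopK(scores, 5) on the output of one image yields the five class indexes to print and the confidence sum in a single pass over the sorted prefix.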

During model conversion, set CSC parameters to convert YUV420SP images into RGB images to meet the input requirements of the model.
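To make the color space conversion concrete, the following sketch converts one pixel of an NV12 buffer to RGB on the CPU. It is only an illustration of the arithmetic: in the sample the conversion is performed by the model itself using the CSC parameters set at conversion time. The full-range BT.601 coefficients used here are an assumption and may differ from the matrix in aipp.cfg, and the names Rgb, YuvToRgb, and Nv12PixelToRgb are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

struct Rgb {
    std::uint8_t r, g, b;
};

// Rounds to the nearest integer and clamps to the 0..255 range.
static std::uint8_t ClampToByte(float v) {
    return static_cast<std::uint8_t>(std::lround(std::min(255.0f, std::max(0.0f, v))));
}

// Converts one YUV sample to RGB with full-range BT.601 coefficients (assumed).
Rgb YuvToRgb(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const float yf = static_cast<float>(y);
    const float uf = static_cast<float>(u) - 128.0f;
    const float vf = static_cast<float>(v) - 128.0f;
    return Rgb{ClampToByte(yf + 1.402f * vf),
               ClampToByte(yf - 0.344136f * uf - 0.714136f * vf),
               ClampToByte(yf + 1.772f * uf)};
}

// Reads pixel (x, y) from a tightly packed NV12 buffer: a full-resolution Y plane
// followed by one interleaved U/V pair for every 2 x 2 block of pixels.
Rgb Nv12PixelToRgb(const std::uint8_t* nv12, std::size_t width, std::size_t height,
                   std::size_t x, std::size_t y) {
    const std::uint8_t* yPlane = nv12;
    const std::uint8_t* uvPlane = nv12 + width * height;
    const std::size_t uvIndex = (y / 2) * width + (x / 2) * 2;
    return YuvToRgb(yPlane[y * width + x], uvPlane[uvIndex], uvPlane[uvIndex + 1]);
}
```

The sketch assumes no row padding; buffers produced by hardware decoders usually carry aligned strides, in which case the stride would replace width in the index calculations.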

Principles

The key functions involved in this sample are listed below by category.

Initialization

  • aclInit: initializes AscendCL.
  • aclFinalize: deinitializes AscendCL.

Device Management

  • aclrtSetDevice: sets the compute device.
  • aclrtGetRunMode: obtains the run mode of the software stack. The internal processing varies depending on the run mode.
  • aclrtResetDevice: resets the compute device and cleans up all resources associated with the device.

Stream Management

  • aclrtCreateStream: creates a stream.
  • aclrtDestroyStream: destroys a stream.
  • aclrtSynchronizeStream: waits for stream tasks to complete.
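The initialization, device, and stream functions above come in acquire/release pairs, and resources are typically released in the reverse order of acquisition (stream, then device, then AscendCL itself). The sketch below illustrates that ordering with a small scope guard. It makes no real AscendCL calls: each log entry merely stands in for the function of the same name, and ScopeGuard and RunLifecycle are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Runs a release action when it goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> release) : release_(std::move(release)) {}
    ~ScopeGuard() {
        if (release_) {
            release_();
        }
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> release_;
};

// Records the call order. Each string stands in for the AscendCL call of the
// same name; nothing is actually initialized or released here.
std::vector<std::string> RunLifecycle() {
    std::vector<std::string> log;
    {
        log.push_back("aclInit");
        ScopeGuard finalize([&log] { log.push_back("aclFinalize"); });

        log.push_back("aclrtSetDevice");
        ScopeGuard resetDevice([&log] { log.push_back("aclrtResetDevice"); });

        log.push_back("aclrtCreateStream");
        ScopeGuard destroyStream([&log] { log.push_back("aclrtDestroyStream"); });

        log.push_back("decode, resize, infer");
    }  // Guards run here in reverse order of construction.
    return log;
}
```

Because C++ destroys local objects in reverse order of construction, the guards release the stream first and finalize AscendCL last, even if the work in between returns early.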

Memory Management

  • aclrtMallocHost: allocates host memory.
  • aclrtFreeHost: frees host memory.
  • aclrtMalloc: allocates device memory.
  • aclrtFree: frees device memory.

In media data processing, if you need to allocate device memory to store the input or output data, call acldvppMalloc to allocate memory and call acldvppFree to free up memory.

Data Transfer

aclrtMemcpy (used when the app runs on the host):

  • Transfers decode source data from the host to the device.
  • Transfers the inference result from the device to the host.

Data transfer is not required if your app runs in the board environment.
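The run-mode-dependent transfer can be sketched as follows. This is a stand-alone illustration under stated assumptions: the RunMode enum is a hypothetical stand-in for the value reported by aclrtGetRunMode, the vector copy stands in for an aclrtMemcpy device-to-host transfer, and ReadableOutput is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the run mode reported by aclrtGetRunMode.
enum class RunMode { Host, Device };

// Returns a pointer to inference output that the app can read directly.
// Host mode: the data is copied into `staging` (standing in for a
// device-to-host aclrtMemcpy) and the staging buffer is returned.
// Device mode: the app runs in the board environment, so the original
// buffer is returned without any copy.
const float* ReadableOutput(RunMode mode, const float* deviceOut, std::size_t count,
                            std::vector<float>& staging) {
    if (mode == RunMode::Host) {
        staging.assign(deviceOut, deviceOut + count);
        return staging.data();
    }
    return deviceOut;
}
```

Querying the run mode once at startup and branching in a single helper like this keeps the rest of the processing code identical in both environments.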

Media Data Processing V1

  • Video decoding

    aclvdecSendFrame: decodes the video stream into YUV420SP images.

  • Resizing

    acldvppVpcResizeAsync: resizes YUV420SP (NV12) images to 224 x 224.
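When allocating the output buffer for the resize step, the size of a YUV420SP (NV12) image has to be computed from its aligned dimensions. The sketch below shows the arithmetic. The alignment values (width stride to a multiple of 16, height stride to a multiple of 2) are an assumption here and should be checked against the constraints for your chip version; AlignUp, Nv12Layout, and ComputeNv12Layout are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment` (alignment must be > 0).
inline std::uint32_t AlignUp(std::uint32_t value, std::uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

struct Nv12Layout {
    std::uint32_t widthStride;
    std::uint32_t heightStride;
    std::uint32_t bufferSize;  // in bytes
};

// Computes strides and buffer size for an NV12 image.
// Assumption: width stride aligned to 16, height stride aligned to 2.
inline Nv12Layout ComputeNv12Layout(std::uint32_t width, std::uint32_t height) {
    Nv12Layout layout{};
    layout.widthStride = AlignUp(width, 16);
    layout.heightStride = AlignUp(height, 2);
    // Full-resolution Y plane plus a half-size interleaved UV plane: 1.5 bytes per pixel.
    layout.bufferSize = layout.widthStride * layout.heightStride * 3 / 2;
    return layout;
}
```

For the 224 x 224 model input both dimensions are already aligned, so the buffer is 224 x 224 x 3 / 2 = 75264 bytes; an unaligned size such as 225 x 225 would be padded to 240 x 226 first.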

Model Inference

  • aclmdlLoadFromFileWithMem: loads a model from an .om file.
  • aclmdlExecute: performs model inference.

    Before inference, use the CSC parameters in the .om file to convert a YUV420SP image into an RGB image.

  • aclmdlUnload: unloads a model.

Directory Structure

The sample directory is organized as follows:

├── caffe_model
│   ├── aipp.cfg        //Configuration file with CSC parameters, used for model conversion

├── data
│   ├── vdec_h265_1frame_rabbit_1280x720.h265            //Test data (single-frame H.265 video stream). Obtain the file according to the guide and save it to the data directory.

├── inc
│   ├── dvpp_process.h               //Header file that declares functions related to media data processing
│   ├── model_process.h              //Header file that declares functions related to model processing
│   ├── sample_process.h              //Header file that declares functions related to resource initialization and destruction
│   ├── utils.h                       //Header file that declares common functions (such as the file reading function)
│   ├── vdec_process.h                 //Header file that declares functions related to video processing

├── src
│   ├── acl.json         //Configuration file for system initialization
│   ├── CMakeLists.txt         //Build script
│   ├── dvpp_process.cpp       //Implementation file of functions related to media data processing
│   ├── main.cpp               //Implementation file of the main function, for image classification
│   ├── model_process.cpp      //Implementation file of model processing functions
│   ├── sample_process.cpp     //Implementation file of functions related to resource initialization and destruction
│   ├── utils.cpp              //Implementation file of common functions (such as the file reading function)
│   ├── vdec_process.cpp                 //Implementation file of functions related to video processing

├── .project     //Project information file, including the project type, project description, and type of the target device
├── CMakeLists.txt    //Build script that calls the CMakeLists file in the src directory

App Build and Run (Ascend EP Mode)

Refer to vdec_resnet50_classification to obtain the sample. View the README file in the sample.

App Build and Run (Ascend RC Mode)

Refer to vdec_resnet50_classification to obtain the sample. View the README file in the sample.