Sample Overview

Obtaining the Sample

Visit the Ascend samples repository on Gitee and download the sample package that matches the CANN version you are using. For the version mapping, see "Release Notes" in the README file. The "vdec_resnet50_classification" sample is in the python/level2_simple_inference/1_classification/vdec_resnet50_classification directory.

Function Description

This sample shows how to classify images based on the Caffe ResNet-50 network (single input with batch size = 1).

First, convert the model file of the Caffe ResNet-50 network into an offline model (.om file) adapted to the Ascend AI Processor. The sample then loads the .om file, decodes an H.265 video stream (containing only one frame) 10 times in a loop to obtain 10 YUV420SP NV12 images, resizes the 10 images, and performs inference on the resized images. Finally, it processes the inference results and outputs the class indexes with the highest confidence values and the sum of the top 5 confidence values.
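
The postprocessing step can be sketched in plain Python as follows. This is a minimal illustration, not the sample's own code: the function name `top_k_summary` and the input values are hypothetical, and the sketch assumes that the inference output has already been copied to the host as a flat sequence of per-class confidence values.

```python
import heapq


def top_k_summary(confidences, k=5):
    """Return the indexes of the k largest confidences and their sum.

    `confidences` is assumed to be a flat sequence with one value per class.
    """
    # Pair every value with its class index, then keep the k largest pairs.
    best = heapq.nlargest(k, enumerate(confidences), key=lambda pair: pair[1])
    indexes = [index for index, _ in best]
    total = sum(value for _, value in best)
    return indexes, total


# Hypothetical confidences for a 7-class output.
scores = [0.05, 0.40, 0.10, 0.20, 0.02, 0.15, 0.08]
indexes, total = top_k_summary(scores)
print(indexes)          # [1, 3, 5, 2, 6]
print(round(total, 2))  # 0.93
```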

During model conversion, set the color space conversion (CSC) parameters so that YUV420SP images are converted into RGB images, which is the input format that the model requires.
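
To illustrate what this conversion does, the sketch below reads one pixel from a YUV420SP NV12 buffer and converts it to RGB on the host. It is an illustration only: in this sample the conversion follows the CSC parameters set during model conversion and is not done in Python. The function name `nv12_pixel_to_rgb` is hypothetical, and the full-range BT.601 coefficients are an assumption; the coefficients configured for the model may differ.

```python
def nv12_pixel_to_rgb(buf, width, height, x, y):
    """Return the (R, G, B) value of pixel (x, y) in an NV12 buffer.

    NV12 layout: a full-resolution Y plane (width * height bytes) followed by
    an interleaved UV plane in which each (U, V) pair is shared by a 2x2 block.
    Assumes even width and height and no row padding (stride == width).
    """
    y_value = buf[y * width + x]
    uv_offset = width * height + (y // 2) * width + (x // 2) * 2
    u = buf[uv_offset] - 128
    v = buf[uv_offset + 1] - 128

    def clamp(value):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, int(round(value))))

    # Full-range BT.601 coefficients (an assumption, see the text above).
    r = clamp(y_value + 1.402 * v)
    g = clamp(y_value - 0.344136 * u - 0.714136 * v)
    b = clamp(y_value + 1.772 * u)
    return r, g, b


# A 2x2 mid-gray image: four Y bytes followed by one (U, V) pair.
gray = bytes([128, 128, 128, 128, 128, 128])
print(nv12_pixel_to_rgb(gray, 2, 2, 1, 1))  # (128, 128, 128)
```

Buffers produced by hardware decoders are often padded to an aligned stride, in which case the row offsets above would need to use the stride instead of the width.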

Main APIs

Table 1 shows the main APIs.

**Table 1** Main APIs
Function	ACL Module	ACL Interface Function	Description
Resource initialization	Initialization	acl.init	Initializes the ACL configuration.
	Device management	acl.rt.set_device	Specifies the device for computation.
	Context management	acl.rt.create_context	Creates a context.
	Stream management	acl.rt.create_stream	Creates a stream.
Model initialization	Model loading and execution	acl.mdl.load_from_file	Loads the model from the .om file to the device.
	Data types and operation APIs	acl.mdl.create_desc	Creates data for describing the model.
	Data types and operation APIs	acl.mdl.get_desc	Obtains data for describing the model.
Data preprocessing	Media data module	acl.media.vdec_send_frame	Sends a video stream frame to the decoder for decoding.
	Data types and operation APIs	acl.media.vdec_set_channel_desc series	Sets the description of a video processing channel.
	Data types and operation APIs	acl.media.dvpp_vpc_resize_async	Resizes the input image to the size of the output image.
	Data types and operation APIs	acl.media.dvpp_set_pic_desc series	Sets image description parameters.
Model inference	Model loading and execution	acl.mdl.execute	Performs synchronous model inference.
Data postprocessing	Data types and operation APIs	acl.op.create_attr	Creates data of the aclopAttr type.
	Data types and operation APIs	acl.create_tensor_desc	Creates data of the aclTensorDesc type.
	Data types and operation APIs	acl.get_tensor_desc_size	Obtains the size of a tensor description.
	Data types and operation APIs	acl.create_data_buffer	Creates data of the aclDataBuffer type.
Data exchange	Memory management	acl.rt.memcpy	Sends data from the host to the device or from the device to the host.
	Memory management	acl.media.dvpp_malloc	Allocates memory for media data processing on the device.
	Memory management	acl.rt.malloc	Allocates device memory.
	Memory management	acl.rt.malloc_host	Allocates host memory.
Single-operator inference	Operator loading and execution	acl.op.execute	Loads and executes an operator. This API is asynchronous.
Common module	--	acl.util.ptr_to_numpy	Obtains the numpy.ndarray object based on the pointer address.
Common module	--	acl.util.numpy_to_ptr	Obtains the pointer address of the memory data of the numpy.ndarray object.
Resource release	Memory management	acl.rt.free	Frees device memory.
	Memory management	acl.media.dvpp_free	Frees memory allocated by using acl.media.dvpp_malloc.
	Memory management	acl.rt.free_host	Frees host memory.
	Model loading and execution	acl.mdl.unload	Unloads a model.
	Stream management	acl.rt.destroy_stream	Destroys a stream.
	Context management	acl.rt.destroy_context	Destroys a context.
	Device management	acl.rt.reset_device	Resets the current device and reclaims the resources on the device.
	Deinitialization	acl.finalize	Deinitializes ACL.
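
The resource initialization and release rows of Table 1 pair up naturally: resources are released in the reverse order of their creation. The sketch below shows one way to express that pairing. It is not taken from the sample: the helper names `check` and `acl_resources` are hypothetical, the `acl` module is passed in as a parameter so that the code can be read and exercised without an Ascend device, and the return conventions (a status code of 0 on success, or an (object, status code) tuple for the create calls) are assumed to match those used in the pyACL samples.

```python
from contextlib import ExitStack, contextmanager


def check(ret, name):
    """Raise if an ACL call did not return the success code 0."""
    if ret != 0:
        raise RuntimeError(f"{name} failed, ret={ret}")


@contextmanager
def acl_resources(acl, device_id=0):
    """Set up ACL, a device, a context, and a stream; tear down in reverse."""
    with ExitStack() as stack:
        check(acl.init(), "acl.init")
        stack.callback(acl.finalize)

        check(acl.rt.set_device(device_id), "acl.rt.set_device")
        stack.callback(acl.rt.reset_device, device_id)

        context, ret = acl.rt.create_context(device_id)
        check(ret, "acl.rt.create_context")
        stack.callback(acl.rt.destroy_context, context)

        stream, ret = acl.rt.create_stream()
        check(ret, "acl.rt.create_stream")
        stack.callback(acl.rt.destroy_stream, stream)

        # Callbacks run last-in, first-out when the block exits:
        # destroy_stream, destroy_context, reset_device, finalize.
        yield context, stream


# Typical use on a machine with pyACL installed:
#     import acl
#     with acl_resources(acl, device_id=0) as (context, stream):
#         ...  # load the model, decode, resize, and run inference here
```

Because the cleanup callbacks are registered immediately after each successful call, a failure partway through initialization still releases whatever was already created.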

Video Decoding and Model Inference Process

For details, see Figure 1.

Figure 1 Video decoding and model inference process

Directory Structure

The directory structure is as follows:

vdec_resnet50_classification
├── src
│ ├── acl_dvpp.py // Image resizing implementation file
│ ├── acl_model.py // Model inference implementation file
│ ├── acl_sample.py // Running file
│ ├── acl_util.py // Implementation file of tool functions 
│ ├── acl_vdec.py // Video decoding implementation file
│ └── constant.py // Constant definition
├── data
│ └── vdec_h265_1frame_rabbit_1280x720.h265 // Video file to be processed, which must be obtained by the user
├── caffe_model
│ ├── aipp.cfg
│ ├── resnet50.caffemodel // ResNet-50 model
│ └── resnet50.prototxt // ResNet-50 network file
└── model
  └── resnet50_aipp.om // Inference model
