Overview

API Differences Between Versions

The V1 and V2 media data processing APIs described in this document both provide functions such as video encoding and decoding, image encoding and decoding, and image processing. However, the two sets of APIs must not be mixed.
- V2 has more functions than V1. For example:
  - JPEGE: The APIs in the V2 version support advanced parameter configuration, such as Huffman table configuration.
  - VENC: The APIs in the V2 version support more refined configuration of bit rate control parameters and effect tuning, such as the QP of I-/P-frames and macroblock bit rate control.
  - VDEC: The APIs in the V2 version support more refined memory control, such as the setting of the input stream buffer.
  - Video data acquisition (ISP system control, MIPI command, and VI function): supported only by the APIs in the V2 version.
  - VPSS video processing: supported only by the APIs in the V2 version.
  - Audio-related functions, including recording, playing, and volume adjustment: supported only by the APIs in the V2 version.
  - Video data display (VO function and HDMI peripheral): supported only by the APIs in the V2 version.
- V2 APIs are recommended because API functions and services will continue to evolve on V2 in later versions.
- V1 APIs are retained for backward compatibility but will be deprecated in later versions.

Function Description

Table 1 describes the functions of media data processing V1 (Digital Vision Pre-Processing, DVPP).

**Table 1** Function description
Function	Description
Vision Preprocessing Core (VPC)	Processes images, including cropping, resizing, and format conversion. For details, see Restrictions.
JPEG Decoder (JPEGD)	Decodes .jpg, .jpeg, .JPG, and .JPEG images into YUV images. For details, see Functions and Restrictions.
JPEG Encoder (JPEGE)	Encodes YUV images into .jpg images. For details, see Functions and Restrictions.
Video Decoder (VDEC)	Decodes videos. For details, see Functions and Restrictions.
Video Encoder (VENC)	Encodes videos. For details, see Functions and Restrictions.
PNG Decoder (PNGD)	Decodes PNG images. For details, see Functions and Restrictions.

Function Support

Table 2 describes the functions of media data processing V1 supported by each product model.

The meanings of the identifiers are as follows:

√: supported

x: not supported

**Table 2** Function support
Model	VPC	JPEGD	JPEGE	PNGD	VDEC	VENC
Atlas training products	√	√	√	√	√	x
Atlas inference products	√	√	√	√	√	√
Atlas 200I/500 A2 inference products	√	√	√	√	√	√
Atlas A2 training products/Atlas A2 inference products	√	√	√	√	√	x
Atlas A3 training products/Atlas A3 inference products	√	√	√	√	√	x

Restrictions

When using the APIs described in this chapter, pay attention to the following points:

About asynchronous APIs
For the asynchronous APIs described in this section, a successful API call indicates only that the task was delivered, not that it was executed successfully. For APIs that depend on each other, you are advised to specify the same stream, because tasks in the same stream are executed in the order in which the APIs are called.

When asynchronous APIs are called to decode, crop, or resize images and the tasks depend on each other, call aclrtSynchronizeStream to ensure that the tasks in the stream are executed in order.

To ensure performance, it is recommended that aclrtSynchronizeStream be called once after multiple asynchronous media data processing tasks are delivered to a stream.

After an asynchronous API is called, do not free the memory or destroy the resources used by the task immediately. First call a synchronization API (for example, aclrtSynchronizeStream) to confirm that the tasks delivered to the device have completed.
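
The ordering and synchronization guidance above can be illustrated with a small Python model. This is a sketch, not the ACL API: ToyStream, submit, synchronize, and run_pipeline are hypothetical names. The toy runs queued tasks only when synchronize is called, whereas a real device executes delivered tasks in the background and aclrtSynchronizeStream blocks until they finish.

```python
from collections import deque


class ToyStream:
    """Hypothetical stand-in for a device stream: a FIFO of delivered tasks."""

    def __init__(self):
        self._pending = deque()

    def submit(self, task):
        # Models an asynchronous API call: returning only means the task
        # was delivered to the stream, not that it has been executed.
        self._pending.append(task)

    def synchronize(self):
        # Models stream synchronization: every delivered task has run,
        # in delivery order, by the time this returns.
        while self._pending:
            self._pending.popleft()()


def run_pipeline(image_names):
    stream = ToyStream()
    log = []
    # Stand-in for device buffers that the queued tasks would use.
    buffers = {name: bytearray(4) for name in image_names}

    for name in image_names:
        # Dependent tasks go to the same stream so they run in call order.
        stream.submit(lambda n=name: log.append(("decode", n)))
        stream.submit(lambda n=name: log.append(("resize", n)))

    assert log == []      # delivery succeeded, but nothing has executed yet
    stream.synchronize()  # one synchronization for the whole batch
    buffers.clear()       # release buffers only after synchronization
    return log
```

Calling run_pipeline(["a.jpg", "b.jpg"]) returns the decode and resize entries for a.jpg followed by those for b.jpg, which mirrors the in-order execution of tasks within one stream.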

About memory allocation and deallocation
1. If device memory is needed to store the input or output data of the VPC, JPEGD, or JPEGE function, call acldvppMalloc to allocate the memory and acldvppFree to free it. If multiple functions are cascaded and the same memory segment is reused, allocate the memory based on the largest size required.
2. The memory allocated in step 1 can be used for media data processing and for other tasks. For example, the output of media data processing can be used directly as the input of model inference, which reuses the memory and reduces memory copies.
3. The address space accessed by media data processing is limited. To ensure that sufficient memory is available for media data processing, you are advised to call other memory allocation APIs, such as aclrtMalloc, to allocate memory for other functions (for example, model loading).
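
As an illustration of sizing a buffer that is reused across cascaded functions, the following Python sketch takes the largest size needed by any stage. It assumes unpadded YUV420SP images (1.5 bytes per pixel) with even dimensions. The helper names and the example resolutions are hypothetical, and actual sizes may be larger because of the width and height alignment requirements of each function, which this sketch ignores.

```python
def yuv420sp_size(width, height):
    """Bytes for an unpadded YUV420SP image (assumption: no alignment padding).

    The Y plane holds width * height bytes and the interleaved UV plane
    holds half of that, so the total is width * height * 3 / 2.
    """
    return width * height * 3 // 2


def shared_buffer_size(stage_shapes):
    """Size of one buffer that can hold the image of every cascaded stage."""
    return max(yuv420sp_size(w, h) for w, h in stage_shapes)


# Hypothetical cascade: decoded frame, resized frame, model input.
stages = [(1920, 1080), (1280, 720), (224, 224)]
print(shared_buffer_size(stages))  # 3110400, the size of the 1920x1080 stage
```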

About channel requirements

Before using any media data processing function, you must call APIs to create the corresponding channel. For details, see Channel Creation and Destruction. Channel creation and destruction involve resource allocation and release, so repeatedly creating and destroying channels degrades service performance. You are therefore advised to manage channels based on your actual scenario. For example, to process images with VPC continuously, create the VPC channel once, complete all VPC calls, and then destroy the channel.

Creating too many channels increases the CPU usage and memory usage of the device. For details about the number of channels, see the performance specifications in the corresponding function sections.

The maximum number of channels for each media data processing function depends on the product model, as listed below.

Atlas inference products
- The maximum number of channels for VPC is 256.
- JPEGD and VDEC share channels and support a maximum of 256 channels.
- JPEGE and VENC share channels and support a maximum of 256 channels.
- The maximum number of channels for PNGD is 128.
- In Ascend virtual instance scenarios, the number of channels is as follows. If the total number of channels is not an integer, the number is rounded down.
  - Total number of VPC channels = (Number of allocated VPC hardware units/Total number of VPC hardware units) × 256
  - Total number of VDEC and JPEGD channels = (Total number of allocated VDEC and JPEGD hardware units/Total number of VDEC and JPEGD hardware units) × 256
  - Total number of VENC and JPEGE channels = (Total number of allocated VENC and JPEGE hardware units/Total number of VENC and JPEGE hardware units) × 256
  - Total number of PNGD channels = Allocation specification × 128
  - For the PNGD function, the restrictions on the number of channels are different if the following Ascend virtual instance templates are used:
    - When the vir04_4c_dvpp template is used, the total number of channels is fixed at 128.
    - When the vir04_3c_ndvpp template is used, the DVPP function is not used. Therefore, the total number of channels is 0.

Atlas A2 training products/Atlas A2 inference products
- The maximum number of channels for VPC is 256.
- JPEGD and VDEC share channels and support a maximum of 256 channels. The maximum number of JPEGD decoding channels is 256, and the maximum number of VDEC decoding channels is 32.
- The maximum number of channels for JPEGE is 256.
- The maximum number of channels for PNGD is 128.
- In Ascend virtual instance scenarios, the number of channels is as follows. If the total number of channels is not an integer, the number is rounded down.
  - The maximum number of channels for VPC is 256.
  - Total number of VDEC channels = (Number of allocated VDEC hardware units/Number of VDEC hardware units) × 32
  - The number of JPEGD channels is not affected by the computing power, but the maximum number of JPEGD+VDEC channels is 256.
  - The maximum number of channels for JPEGE is 256.
  - Total number of PNGD channels = Allocation specification × 128
  - For the PNGD function, the restrictions on the number of channels are different if the following Ascend virtual instance templates are used:
    - When the vir12_4c_32g_m, vir10_4c_16g_m, or vir10_4c_32g_m template is used, the total number of channels is fixed at 128.
    - When the vir12_3c_32g_nm, vir10_3c_16g_nm, or vir10_3c_32g_nm template is used, the DVPP function is not used. Therefore, the total number of channels is 0.

Atlas 200I/500 A2 inference products
- The maximum number of channels for VPC is 128.
- The maximum number of channels for JPEGD is 128.
- The maximum number of channels for VDEC is 128.
- The maximum number of channels for JPEGE is 128.
- The maximum number of channels for VENC is 128.
- The maximum number of channels for PNGD is 128.
- In Ascend virtual instance scenarios, the number of channels is as follows. If the total number of channels is not an integer, the number is rounded down.
  - The maximum number of channels for VPC is 128.
  - Total number of VDEC channels = (Number of allocated VDEC hardware units/Number of VDEC hardware units) × 128
  - The number of JPEGD channels is not affected by the computing power, but the total number of JPEGD and VDEC channels cannot exceed 128.
  - Total number of VENC channels = (Number of allocated VENC hardware units/Number of VENC hardware units) × 128
  - The number of JPEGE channels is not affected by the computing power, but the total number of JPEGE and VENC channels cannot exceed 128.

Atlas A3 training products/Atlas A3 inference products
- The maximum number of channels for VPC is 256.
- JPEGD and VDEC share channels and support a maximum of 256 channels. The maximum number of JPEGD decoding channels is 256, and the maximum number of VDEC decoding channels is 32.
- The maximum number of channels for JPEGE is 256.
- The maximum number of channels for PNGD is 128.
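
The proportional formulas for Ascend virtual instance scenarios all have the same shape: the share of allocated hardware units multiplied by the full channel count, rounded down. The following Python sketch works through that arithmetic. The function name and the hardware unit counts in the examples are hypothetical, chosen only to show the rounding. Real unit counts depend on the product and the virtual instance template.

```python
def virtual_instance_channels(allocated_units, total_units, full_channels):
    """Channels available to a virtual instance, rounded down.

    Computes floor(allocated_units / total_units * full_channels) with
    integer arithmetic to avoid floating-point rounding surprises.
    """
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    if not 0 <= allocated_units <= total_units:
        raise ValueError("allocated_units must be between 0 and total_units")
    return allocated_units * full_channels // total_units


# Hypothetical example: 1 of 3 units allocated, 256 channels at full capacity.
print(virtual_instance_channels(1, 3, 256))  # 85 (85.33 rounded down)

# Hypothetical example: 2 of 9 units allocated, 32 channels at full capacity.
print(virtual_instance_channels(2, 9, 32))   # 7 (7.11 rounded down)
```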

To check the compute resource specifications of the Ascend virtual instances in different scenarios, run the npu-smi info -t template-info command on the server with the Ascend AI Processor installed.
