Secondary Memory Allocation

Memory can be allocated in either of the following ways:
  • Allocate memory as required. The memory is not split or re-allocated.
  • Allocate a large memory pool once and sub-allocate memory from the pool as required (secondary allocation).

For secondary allocation, use the APIs in the following table to allocate the memory pool. Each API places restrictions on the address and size of the memory it allocates, so your memory pool management must honor these restrictions. Otherwise, memory overwriting may occur.

For details about memory management, see Overview.

API

Function

Input/Output Buffer

acl.rt.memcpy_async

Copies memory. This API is asynchronous.

The source address and destination address passed to this call must be 64-byte aligned.
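The alignment requirement above can be checked before a copy is issued. The following is a minimal sketch, assuming that addresses are handled as plain Python integers; is_aligned and check_copy_addresses are hypothetical helper names, not pyACL APIs.

```python
def is_aligned(addr: int, alignment: int = 64) -> bool:
    """Return True if the integer address is a multiple of `alignment` bytes."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return addr & (alignment - 1) == 0


def check_copy_addresses(dst: int, src: int) -> None:
    """Raise ValueError if either address is not 64-byte aligned (hypothetical helper)."""
    for name, addr in (("dst", dst), ("src", src)):
        if not is_aligned(addr, 64):
            raise ValueError(f"{name} address {addr:#x} is not 64-byte aligned")
```

A check of this kind is most useful when the addresses come from your own memory pool arithmetic rather than directly from an allocation API.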

acl.rt.malloc

Allocates size bytes of linear memory on the device and returns a pointer to the allocated memory through dev_ptr. The start address of the allocated memory is 64-byte aligned. The requested size is rounded up to the nearest multiple of 32 bytes, and an additional 32 bytes are added. For huge page memory with an allocation granularity of 1 GB, however, this API only rounds the requested size up to the nearest multiple of 32 bytes and does not add the extra 32 bytes, in order to save huge page memory.
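The size rule described above can be worked through numerically. The following is a minimal sketch of the arithmetic only; reserved_size and its one_gb_huge_page flag are hypothetical names used for illustration and are not part of pyACL.

```python
def align_up(length: int, k: int) -> int:
    """Round `length` up to a multiple of `k` using integer division."""
    return ((length - 1) // k + 1) * k


def reserved_size(size: int, one_gb_huge_page: bool = False) -> int:
    """Size implied by the documented rule for a request of `size` bytes.

    Assumption: `one_gb_huge_page` models huge page memory with 1 GB
    allocation granularity, where the extra 32 bytes are not added.
    """
    rounded = align_up(size, 32)
    return rounded if one_gb_huge_page else rounded + 32
```

For example, under this rule a 100-byte request corresponds to 160 bytes (128 + 32), or 128 bytes in the 1 GB huge page case.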

If you use this API to allocate a large memory block, and divide and manage the memory, each memory segment must meet the following requirements:
  • The memory size is rounded up to the nearest multiple of 32 plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  • The memory start address must be 64-byte aligned (ALIGN_UP[m,64]).
NOTE:

len indicates the size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, calculated as ((len - 1) / k + 1) * k, where / is integer division.
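The two requirements above can be combined into a layout plan for a pool. The following is a minimal sketch, assuming that the pool base address is itself 64-byte aligned and that each segment occupies ALIGN_UP[m,64] bytes so that the next segment starts on an aligned address; plan_segments is a hypothetical helper, not a pyACL API.

```python
def align_up(length: int, k: int) -> int:
    """ALIGN_UP[len,k] with integer division: ((len - 1) // k + 1) * k."""
    return ((length - 1) // k + 1) * k


def plan_segments(lengths, alignment=64):
    """Return (offsets, total) for segments laid out back to back in a pool.

    Offsets are relative to the pool base address (assumed aligned).
    """
    offsets = []
    cursor = 0
    for length in lengths:
        offsets.append(cursor)
        m = align_up(length, 32) + 32          # segment size rule
        cursor += align_up(m, alignment)       # keep the next start aligned
    return offsets, cursor
```

With segment lengths of 100, 200, and 64 bytes, the plan places the segments at offsets 0, 192, and 448, and the pool must provide at least 576 bytes.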

acl.media.dvpp_malloc

Allocates device memory for media data processing. The allocated huge page memory meets the media data processing requirements (for example, the start address is 128-byte aligned).

For details, see Media Data Processing V1.

When the output of media data processing is used as the input of model inference, if you use this API to allocate a large memory block and divide and manage the memory, each memory segment must meet the following requirements:

  • The memory size is rounded up to the nearest multiple of 32 plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  • The memory start address must be 128-byte aligned (ALIGN_UP[m,128]).
NOTE:

len indicates the size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, calculated as ((len - 1) / k + 1) * k, where / is integer division.

acl.himpi.dvpp_malloc

Allocates device memory. The requested allocation must meet the media data processing requirements (for example, 128-byte alignment of the start address).

For details, see Media Data Processing V2.

When the output of media data processing is used as the input of model inference, if you use this API to allocate a large memory block and divide and manage the memory, each memory segment must meet the following requirements:

  • The memory size is rounded up to the nearest multiple of 32 plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  • The memory start address must be 128-byte aligned (ALIGN_UP[m,128]).
NOTE:

len indicates the size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, calculated as ((len - 1) / k + 1) * k, where / is integer division.

acl.rt.malloc_host

Allocates host memory (page-locked memory) if the application is running on the host. The system ensures that the start address of the memory is 64-byte aligned.

Allocates device memory in normal pages if the application is running on the device. In this case, if the start address must be 64-byte aligned, you need to guarantee the alignment yourself.
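When the alignment must be guaranteed by the caller, one common technique is to request extra bytes and round the returned address up to the next boundary. The following is a minimal sketch of that arithmetic, assuming addresses are plain Python integers; padded_request and aligned_start are hypothetical helpers, and the original (unrounded) address must still be kept for freeing the memory.

```python
def padded_request(size: int, alignment: int = 64) -> int:
    """Bytes to request so that an aligned region of `size` bytes always fits."""
    return size + alignment - 1


def aligned_start(base_addr: int, alignment: int = 64) -> int:
    """Round an address up to the next multiple of `alignment` (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (base_addr + alignment - 1) & ~(alignment - 1)
```

For example, an address of 0x1001 is rounded up to 0x1040, while an address that is already aligned, such as 0x1000, is left unchanged.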

If you use this API to allocate a large memory block, and divide and manage the memory, each memory segment must meet the following requirements:
  • The memory size is rounded up to the nearest multiple of 32 plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  • The memory start address must be 64-byte aligned (ALIGN_UP[m,64]).
NOTE:

len indicates the size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, calculated as ((len - 1) / k + 1) * k, where / is integer division.

acl.rt.malloc_cached

Allocates size bytes of linear memory on the device and returns a pointer to the allocated memory in dev_ptr. The memory allocated by this API supports caching in all scenarios.

Other restrictions are the same as those of acl.rt.malloc.

Computer vision applications generally use media data processing functions and therefore the preceding memory allocation APIs, whose memory start addresses are 64- or 128-byte aligned. To simplify unified management, you are advised to use the larger alignment value, that is, 128-byte alignment, for all segments.
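A unified pool that applies 128-byte alignment to every segment can be sketched as a simple bump allocator. This is only an illustration under stated assumptions: SegmentPool is a hypothetical class, the base address is assumed to come from one of the preceding allocation APIs and to be 128-byte aligned, and segments are never freed individually.

```python
def align_up(length: int, k: int) -> int:
    """ALIGN_UP[len,k] with integer division: ((len - 1) // k + 1) * k."""
    return ((length - 1) // k + 1) * k


class SegmentPool:
    """Hands out aligned segments from one pre-allocated block (hypothetical)."""

    def __init__(self, base_addr: int, pool_size: int, alignment: int = 128):
        if base_addr % alignment:
            raise ValueError("pool base address must be aligned")
        self.base = base_addr
        self.size = pool_size
        self.alignment = alignment
        self.cursor = 0  # offset of the next free byte in the pool

    def alloc(self, length: int) -> int:
        """Return the start address of a segment that can hold `length` bytes."""
        m = align_up(length, 32) + 32            # segment size rule
        stride = align_up(m, self.alignment)     # keep the next start aligned
        if self.cursor + stride > self.size:
            raise MemoryError("memory pool exhausted")
        addr = self.base + self.cursor
        self.cursor += stride
        return addr
```

For example, in a 1024-byte pool based at 0x10000, requests of 100 and 300 bytes return 0x10000 and 0x10100, and a further 500-byte request raises MemoryError because only 384 bytes remain.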

The following describes the typical scenarios where the memory is managed by the user during media data processing. Media Data Processing V1 details the available media data processing features.
Figure 1 VDEC scenario
Figure 2 JPEGD scenario