Secondary Memory Allocation

Memory can be allocated in either of the following ways:
  • Allocate each memory block as required and use it whole. The memory is not split or allocated a second time.
  • Allocate one large memory pool up front, then allocate segments from the pool as required (secondary allocation).

When you perform secondary allocation, the pool memory is obtained and used through the APIs in the table. Each API restricts the address and size of the memory it allocates or accepts, so you must honor these restrictions when managing the memory pool. Otherwise, memory overwriting may occur.

For details about memory management, see Overview.

Each entry below lists the API, its function, and the restrictions on its input/output buffers.

acl.rt.memcpy_async

Copies memory. This API is asynchronous.

The source address and destination address passed to this call must be 64-byte aligned.

acl.rt.malloc

Allocates size bytes of linear memory on the device and returns in dev_ptr a pointer to the allocated memory. The allocation size is the input size rounded up to the nearest multiple of 32 bytes, plus 32 bytes.

If you use this API to allocate a large memory block and then divide and manage it yourself, each memory segment must meet the following requirements:
  • The segment size is rounded up to the nearest multiple of 32 bytes, plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  • The segment start address must be 64-byte aligned. To keep consecutive segments aligned, reserve ALIGN_UP[m,64] bytes for each segment.
NOTE:

len indicates the requested size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, as in this formula: ((len - 1) / k + 1) × k, where / is integer division.
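The size rule above can be sketched in Python as follows. This is a minimal illustration of the arithmetic only. The helper names align_up, segment_size, and segment_stride are hypothetical and are not part of pyACL.

```python
def align_up(length: int, k: int) -> int:
    """ALIGN_UP[length, k]: round length up to a multiple of k."""
    if length <= 0 or k <= 0:
        raise ValueError("length and k must be positive")
    return ((length - 1) // k + 1) * k


def segment_size(length: int) -> int:
    """m = ALIGN_UP[len, 32] + 32 bytes."""
    return align_up(length, 32) + 32


def segment_stride(length: int, alignment: int = 64) -> int:
    """Bytes to reserve per segment so the next segment also starts aligned."""
    return align_up(segment_size(length), alignment)


# A 100-byte request: ALIGN_UP[100, 32] = 128, m = 160, stride = 192.
print(align_up(100, 32), segment_size(100), segment_stride(100))
```

For a 100-byte segment this gives m = 160 bytes, and reserving 192 bytes keeps the following segment on a 64-byte boundary.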

acl.media.dvpp_malloc

Allocates device memory for media data processing. The memory is allocated in huge pages and meets the data processing requirements (for example, the start address is 128-byte aligned).

For details, see Media Data Processing V1.

When the output of media data processing is used as the input of model inference, and you use this API to allocate a large memory block that you then divide and manage yourself, each memory segment must meet the following requirements:

  • The segment size is rounded up to the nearest multiple of 32 bytes, plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  • The segment start address must be 128-byte aligned. To keep consecutive segments aligned, reserve ALIGN_UP[m,128] bytes for each segment.
NOTE:

len indicates the requested size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, as in this formula: ((len - 1) / k + 1) × k, where / is integer division.
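As an illustration of dividing one DVPP block into segments, the following sketch computes the offset of each segment and the total pool size to request. It works on plain integers and makes no pyACL calls. The function name plan_pool is hypothetical, and the sketch assumes that the pool base address (as returned by acl.media.dvpp_malloc) is already 128-byte aligned, as stated above.

```python
def align_up(value: int, k: int) -> int:
    """ALIGN_UP[value, k] for positive integers."""
    return ((value - 1) // k + 1) * k


def plan_pool(lengths, alignment=128):
    """Lay segments out back to back; return (offsets, total_size)."""
    offsets = []
    cursor = 0
    for length in lengths:
        offsets.append(cursor)            # offset is a multiple of alignment
        m = align_up(length, 32) + 32     # m = ALIGN_UP[len, 32] + 32
        cursor += align_up(m, alignment)  # reserve ALIGN_UP[m, alignment]
    return offsets, cursor


# Three segments of 1000, 4096, and 100 bytes in one pool.
offsets, total = plan_pool([1000, 4096, 100])
print(offsets, total)  # [0, 1152, 5376] 5632
```

Each segment address is then the pool base address plus the corresponding offset, and total is the size to request for the whole pool.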

acl.rt.malloc_host

When the pyACL application runs on the host, this API allocates host memory (page-locked memory). The system ensures that the start address of the memory is 64-byte aligned.

When the pyACL application runs on the device, this API allocates device memory using normal pages. If the start address must be 64-byte aligned, you must ensure the alignment yourself.
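One common way to ensure alignment yourself is to request extra bytes and round the returned address up to the next 64-byte boundary. The sketch below shows only this address arithmetic on plain integers. The names aligned_address and padded_request are hypothetical helpers, not pyACL APIs, and whether over-allocation suits your application is an assumption to verify.

```python
def padded_request(size: int, alignment: int = 64) -> int:
    """Size to request so that an aligned block of `size` bytes always fits."""
    return size + alignment - 1


def aligned_address(raw_ptr: int, alignment: int = 64) -> int:
    """Round an address up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (raw_ptr + alignment - 1) & ~(alignment - 1)


# Hypothetical unaligned address returned by an allocation call.
raw = 0x1008
print(hex(aligned_address(raw)))  # 0x1040
```

Keep the original address as well, because that address, not the rounded one, is the one to pass back when the memory is freed.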

If you use this API to allocate a large memory block and then divide and manage it yourself, each memory segment must meet the following requirements:
  • The segment size is rounded up to the nearest multiple of 32 bytes, plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  • The segment start address must be 64-byte aligned. To keep consecutive segments aligned, reserve ALIGN_UP[m,64] bytes for each segment.
NOTE:

len indicates the requested size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, as in this formula: ((len - 1) / k + 1) × k, where / is integer division.

acl.rt.malloc_cached

Allocates size bytes of cacheable linear memory on the device and returns in dev_ptr a pointer to the allocated memory.

Other restrictions are the same as those of acl.rt.malloc.

Computer vision applications generally use media data processing functions, so they use the preceding memory allocation APIs, whose memory start addresses require 64- or 128-byte alignment. To simplify unified management, you are advised to use the larger alignment value, that is, 128-byte alignment.
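A unified 128-byte policy could be sketched as a simple bump allocator over one pre-allocated pool. The class name PoolAllocator and its methods are hypothetical and not part of pyACL. The sketch tracks addresses as plain integers and assumes the pool base address was obtained from one of the allocation APIs above.

```python
def align_up(value: int, k: int) -> int:
    """ALIGN_UP[value, k] for positive integers."""
    return ((value - 1) // k + 1) * k


class PoolAllocator:
    """Hands out 128-byte-aligned segments from one pre-allocated pool."""

    ALIGNMENT = 128  # the larger of the 64- and 128-byte requirements

    def __init__(self, base: int, size: int):
        if base % self.ALIGNMENT:
            raise ValueError("pool base must be 128-byte aligned")
        self.base = base
        self.size = size
        self.used = 0

    def alloc(self, length: int) -> int:
        m = align_up(length, 32) + 32          # m = ALIGN_UP[len, 32] + 32
        stride = align_up(m, self.ALIGNMENT)   # keeps the next start aligned
        if self.used + stride > self.size:
            raise MemoryError("memory pool exhausted")
        addr = self.base + self.used
        self.used += stride
        return addr

    def reset(self) -> None:
        """Release all segments at once; the pool itself stays allocated."""
        self.used = 0


pool = PoolAllocator(base=0x10000, size=1024)  # hypothetical base address
first = pool.alloc(100)
second = pool.alloc(100)
print(hex(first), hex(second))  # 0x10000 0x10100
```

Because every segment uses the same alignment, a segment from this pool satisfies both the 64-byte and the 128-byte start address requirements.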

The following figures show typical scenarios in which the user manages memory during media data processing. For details about the available media data processing features, see Media Data Processing V1.
Figure 1 VDEC scenario
Figure 2 JPEGD scenario