Function: malloc_align32

Applicability

Product

Supported (√/x)

Atlas A3 training products/Atlas A3 inference products

Atlas A2 training products/Atlas A2 inference products

Atlas training products

Atlas inference products

Atlas 200I/500 A2 inference products

Function Usage

Allocates size bytes of linear memory on the device and returns in dev_ptr a pointer to the allocated memory. The allocation size is the input size rounded up to the nearest multiple of 32 bytes.

Unlike acl.rt.malloc, this API only rounds the input size up to the nearest multiple of 32 bytes. It does not add an extra 32 bytes.
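The size arithmetic described above can be illustrated in plain Python. This is a minimal sketch, not part of pyACL: the helper names align_up, malloc_align32_footprint, and malloc_footprint are hypothetical, and the acl.rt.malloc figure only restates the "extra 32 bytes" comparison made above.

```python
def align_up(size: int, k: int) -> int:
    """Round size up to the nearest multiple of k using integer arithmetic."""
    return ((size - 1) // k + 1) * k


def malloc_align32_footprint(size: int) -> int:
    """Bytes reserved by acl.rt.malloc_align32 according to the description above."""
    return align_up(size, 32)


def malloc_footprint(size: int) -> int:
    """Bytes reserved by acl.rt.malloc, assuming round-up to 32 plus 32 extra bytes."""
    return align_up(size, 32) + 32


for requested in (1, 32, 100):
    print(requested, malloc_align32_footprint(requested), malloc_footprint(requested))
```

Under these assumptions, a 100-byte request reserves 128 bytes with acl.rt.malloc_align32 and 160 bytes with acl.rt.malloc.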

Prototype

  • C Prototype
    aclError aclrtMallocAlign32(void **devPtr, size_t size, aclrtMemMallocPolicy policy)
    
  • Python Function
    dev_ptr, ret = acl.rt.malloc_align32(size, policy)
    

Parameter Description

Parameter

Description

size

Int, allocated memory size, in bytes. Must not be 0.

policy

  • Int, memory allocation policy.

    If the configured policy is outside the value range of aclrtMemMallocPolicy, huge page memory is allocated when size is greater than or equal to 2 MB; otherwise, normal page memory is allocated.
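The fallback rule for an out-of-range policy can be modeled as a small function. This is only a descriptive sketch of the rule stated above, not the runtime's implementation: fallback_page_type is a hypothetical name, and the threshold assumes that "2 MB" means 2 x 1024 x 1024 bytes.

```python
HUGE_PAGE_THRESHOLD = 2 * 1024 * 1024  # assumption: "2 MB" is 2 MiB


def fallback_page_type(size: int) -> str:
    """Page type used when policy is outside the aclrtMemMallocPolicy range."""
    return "huge" if size >= HUGE_PAGE_THRESHOLD else "normal"
```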

Return Value Description

Return Value

Description

dev_ptr

Int, pointer (address) to the allocated device memory.

ret

Int, error code. 0 indicates success; any other value indicates failure.
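A typical call sequence checks ret before using dev_ptr and releases the memory with acl.rt.free afterwards. The following is a minimal sketch under stated assumptions: run_with_buffer and its acl_module and use parameters are hypothetical names, the module is passed in so that the sketch does not depend on an installed pyACL, and the value 0 for ACL_MEM_MALLOC_HUGE_FIRST is an assumption about the aclrtMemMallocPolicy enumeration.

```python
ACL_MEM_MALLOC_HUGE_FIRST = 0  # assumed value of the aclrtMemMallocPolicy member


def run_with_buffer(acl_module, size, use, policy=ACL_MEM_MALLOC_HUGE_FIRST):
    """Allocate size bytes on the device, call use(dev_ptr), then free the memory."""
    if size <= 0:
        raise ValueError("size must not be 0")

    dev_ptr, ret = acl_module.rt.malloc_align32(size, policy)
    if ret != 0:
        raise RuntimeError(f"acl.rt.malloc_align32 failed, ret={ret}")

    try:
        # The allocated memory is not initialized, so use() must write before reading.
        return use(dev_ptr)
    finally:
        # Memory from malloc_align32 is released with acl.rt.free.
        acl_module.rt.free(dev_ptr)
```

With pyACL available, and after initialization and device selection have succeeded, the call might look like run_with_buffer(acl, 1024, print).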

Restrictions

  • Media data processing has stricter requirements on the memory that stores input and output data (for example, the start address of the memory must be 128-byte aligned). Therefore, dedicated memory allocation APIs must be used for media data processing.
  • This API does not initialize the contents of the allocated memory.
  • This API does not perform implicit device synchronization or stream synchronization. The memory allocation result, either success or failure, is returned immediately.
  • Memory allocated with acl.rt.malloc_align32 must be released by calling acl.rt.free.
  • Frequently calling acl.rt.malloc_align32 to allocate memory and acl.rt.free to free it degrades performance. You are advised to allocate or manage memory in advance to avoid frequent allocation and deallocation.
  • If you use this API to allocate one large memory block and then divide and manage it yourself, each memory segment must meet the following requirements:
    • The segment size is len rounded up to the nearest multiple of 32 bytes, plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
    • The memory start address must be 64-byte aligned (ALIGN_UP[m,64]).

    len indicates the size of a memory segment. ALIGN_UP[len,k] indicates rounding len up to a multiple of k bytes, computed with integer division as ((len - 1) / k + 1) * k.
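The segment rules above can be sketched as a layout planner for one pre-allocated block, which also supports the advice to allocate once and reuse memory. This is a minimal sketch under stated assumptions: plan_segments is a hypothetical helper, it interprets ALIGN_UP[m,64] as the distance between consecutive segment start addresses, and it requires the base address of the block to be 64-byte aligned rather than assuming that the API guarantees it.

```python
def align_up(length: int, k: int) -> int:
    """ALIGN_UP[length, k]: round length up to a multiple of k."""
    return ((length - 1) // k + 1) * k


def plan_segments(base_addr: int, lengths):
    """Return (segments, total_bytes); each segment is (start_addr, reserved_size)."""
    if base_addr % 64 != 0:
        raise ValueError("base address must be 64-byte aligned")

    segments = []
    offset = 0
    for length in lengths:
        m = align_up(length, 32) + 32   # m = ALIGN_UP[len,32] + 32 bytes
        segments.append((base_addr + offset, m))
        offset += align_up(m, 64)       # keeps the next start address 64-byte aligned
    return segments, offset
```

For segment lengths of 100, 1, and 64 bytes at base address 0, this yields reserved sizes of 160, 64, and 96 bytes at offsets 0, 192, and 256, for a total of 384 bytes to request in a single allocation.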