aclrtMallocAlign32
Description
Allocates size bytes of linear memory on the device and returns in *devPtr a pointer to the allocated memory. The allocation size is the input size rounded up to the nearest multiple of 32 bytes.
Unlike aclrtMalloc, this API only rounds the input size up to the nearest multiple of 32 bytes; it does not add an extra 32 bytes on top.
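The size difference described above can be sketched as plain arithmetic. The helper names below (align_up, malloc_align32_size, malloc_size) are hypothetical and not part of AscendCL; they only model the rounding rules stated in this section and do not call the runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the nearest multiple of k. Hypothetical helper, not an ACL API. */
size_t align_up(size_t len, size_t k) {
    assert(k != 0);
    return ((len + k - 1) / k) * k;
}

/* Size implied for aclrtMallocAlign32: rounded up to a multiple of 32 bytes only. */
size_t malloc_align32_size(size_t size) {
    return align_up(size, 32);
}

/* Size implied for aclrtMalloc by the comparison above:
 * rounded up to a multiple of 32 bytes, plus an extra 32 bytes. */
size_t malloc_size(size_t size) {
    return align_up(size, 32) + 32;
}
```

Under these assumptions, a 100-byte request corresponds to 128 bytes with aclrtMallocAlign32 and 160 bytes with aclrtMalloc.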
Restrictions
- The memory allocated by this API will not be initialized.
- This API does not perform implicit device synchronization or stream synchronization. The memory allocation result, either success or failure, is returned immediately.
- Memory allocated by aclrtMallocAlign32 must be freed by calling aclrtFree.
- Calling aclrtMallocAlign32 and aclrtFree too frequently degrades performance. You are advised to allocate memory in advance, or to manage it yourself, to avoid unnecessary allocation and freeing.
- If you use this API to allocate a large memory block and then divide and manage it yourself, each memory segment must meet the following requirements:
  - The segment size is the requested length rounded up to the nearest multiple of 32 bytes, plus 32 bytes (m = ALIGN_UP[len,32] + 32 bytes).
  - The segment start address must be 64-byte aligned (ALIGN_UP[m,64]).

  Here, len is the size of a memory segment, and ALIGN_UP[len,k] rounds len up to a multiple of k bytes: ((len - 1)/k + 1) * k.
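The segment requirements above can be sketched as an offset planner for a self-managed pool. This is a minimal sketch under one reading of the rules: it assumes the pool base address is itself 64-byte aligned and that advancing by ALIGN_UP[m,64] per segment is what keeps every start address aligned. All function names are hypothetical, and the code computes offsets only; it does not call the runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the nearest multiple of k. Hypothetical helper, not an ACL API. */
size_t align_up(size_t len, size_t k) {
    assert(k != 0);
    return ((len + k - 1) / k) * k;
}

/* m = ALIGN_UP[len,32] + 32: bytes reserved for a segment with len payload bytes. */
size_t segment_size(size_t len) {
    return align_up(len, 32) + 32;
}

/* ALIGN_UP[m,64]: distance to the next segment start, so that each start
 * stays 64-byte aligned (assumes the pool base is 64-byte aligned). */
size_t segment_stride(size_t len) {
    return align_up(segment_size(len), 64);
}

/* Writes the start offset of each segment into offsets[] and returns the
 * total number of bytes the pool needs. */
size_t plan_pool(const size_t *lens, size_t count, size_t *offsets) {
    size_t cursor = 0;
    for (size_t i = 0; i < count; ++i) {
        offsets[i] = cursor;
        cursor += segment_stride(lens[i]);
    }
    return cursor;
}
```

The returned total could then be passed as the size argument of a single aclrtMallocAlign32 call, with each segment located at the base pointer plus its offset.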
Prototype
aclError aclrtMallocAlign32(void **devPtr, size_t size, aclrtMemMallocPolicy policy)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| devPtr | Output | Pointer to the pointer to the allocated device memory. |
| size | Input | Requested allocation size, in bytes. Must not be 0. |
| policy | Input | Memory allocation policy. |
Returns
The value 0 indicates success, and other values indicate failure. For details, see aclError.