aclrtMemMallocPolicy

typedef enum aclrtMemMallocPolicy {
    ACL_MEM_MALLOC_HUGE_FIRST,
    ACL_MEM_MALLOC_HUGE_ONLY,
    ACL_MEM_MALLOC_NORMAL_ONLY,
    ACL_MEM_MALLOC_HUGE_FIRST_P2P,
    ACL_MEM_MALLOC_HUGE_ONLY_P2P,
    ACL_MEM_MALLOC_NORMAL_ONLY_P2P,
    ACL_MEM_MALLOC_HUGE1G_ONLY,
    ACL_MEM_MALLOC_HUGE1G_ONLY_P2P,
    ACL_MEM_TYPE_LOW_BAND_WIDTH   = 0x0100U,
    ACL_MEM_TYPE_HIGH_BAND_WIDTH  = 0x1000U,
    ACL_MEM_ACCESS_USER_SPACE_READONLY = 0x100000U,
} aclrtMemMallocPolicy;

Both a single enumeration item and a bitwise-OR combination of multiple enumeration items are supported.

  • Configure a single enumeration item:
    • If only ACL_MEM_TYPE_LOW_BAND_WIDTH or ACL_MEM_TYPE_HIGH_BAND_WIDTH is configured, the system uses ACL_MEM_MALLOC_HUGE_FIRST as the page policy by default and allocates huge pages preferentially.
    • If an item other than ACL_MEM_TYPE_LOW_BAND_WIDTH and ACL_MEM_TYPE_HIGH_BAND_WIDTH is configured, the system chooses between high-bandwidth and low-bandwidth physical memory based on the hardware.
  • Configure multiple enumeration items using bitwise OR.

    ACL_MEM_MALLOC_HUGE_FIRST, ACL_MEM_MALLOC_HUGE_ONLY, and ACL_MEM_MALLOC_NORMAL_ONLY can each be combined with ACL_MEM_TYPE_LOW_BAND_WIDTH or ACL_MEM_TYPE_HIGH_BAND_WIDTH. For example: ACL_MEM_MALLOC_HUGE_FIRST | ACL_MEM_TYPE_HIGH_BAND_WIDTH.
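The supported combination can be sketched in C as follows. This is a minimal illustration, not the library's implementation: the enumeration is copied from this section so that the block compiles without the CANN headers, and combine_policy is a hypothetical helper name introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the enumeration in this section so that the sketch compiles
 * without the CANN headers; real code would include "acl/acl.h" instead. */
typedef enum aclrtMemMallocPolicy {
    ACL_MEM_MALLOC_HUGE_FIRST,
    ACL_MEM_MALLOC_HUGE_ONLY,
    ACL_MEM_MALLOC_NORMAL_ONLY,
    ACL_MEM_MALLOC_HUGE_FIRST_P2P,
    ACL_MEM_MALLOC_HUGE_ONLY_P2P,
    ACL_MEM_MALLOC_NORMAL_ONLY_P2P,
    ACL_MEM_MALLOC_HUGE1G_ONLY,
    ACL_MEM_MALLOC_HUGE1G_ONLY_P2P,
    ACL_MEM_TYPE_LOW_BAND_WIDTH   = 0x0100U,
    ACL_MEM_TYPE_HIGH_BAND_WIDTH  = 0x1000U,
    ACL_MEM_ACCESS_USER_SPACE_READONLY = 0x100000U,
} aclrtMemMallocPolicy;

/* Hypothetical helper (not part of the ACL API): ORs one page policy with
 * one memory type, accepting only the combinations listed in this section. */
static aclrtMemMallocPolicy combine_policy(aclrtMemMallocPolicy page_policy,
                                           aclrtMemMallocPolicy mem_type)
{
    assert(page_policy == ACL_MEM_MALLOC_HUGE_FIRST ||
           page_policy == ACL_MEM_MALLOC_HUGE_ONLY ||
           page_policy == ACL_MEM_MALLOC_NORMAL_ONLY);
    assert(mem_type == ACL_MEM_TYPE_LOW_BAND_WIDTH ||
           mem_type == ACL_MEM_TYPE_HIGH_BAND_WIDTH);
    /* OR the raw values, then cast back to the enumeration type. */
    return (aclrtMemMallocPolicy)((uint32_t)page_policy | (uint32_t)mem_type);
}
```

Because ACL_MEM_MALLOC_HUGE_FIRST is the first enumerator and therefore has the value 0, ORing it with a memory type yields the same value as the memory type alone, which is consistent with the default described for a single memory-type item. With the real headers, the combined value would be passed as the policy argument of an allocation call such as aclrtMalloc.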

Table 1 Enumeration items

Enumeration Item

Description

ACL_MEM_MALLOC_HUGE_FIRST

Allocates huge page memory with a granularity of 2 MB. If the allocated memory is not a multiple of 2 MB, it is rounded up to the nearest multiple of 2 MB.

When the allocated memory is less than or equal to 1 MB, normal page memory is allocated even when this allocation rule is used. When the allocated memory is greater than 1 MB, huge page memory is allocated first. If the huge page memory is insufficient, normal page memory is allocated.

ACL_MEM_MALLOC_HUGE_ONLY

Allocates huge page memory with a granularity of 2 MB. If the allocated memory is not a multiple of 2 MB, it is rounded up to the nearest multiple of 2 MB.

If this item is configured, only huge page memory is allocated. If the huge page memory is insufficient, an error is returned.

ACL_MEM_MALLOC_NORMAL_ONLY

Allocates only normal page memory. If the normal page memory is insufficient, an error is returned.

ACL_MEM_MALLOC_HUGE_FIRST_P2P

Allocates huge page memory with a granularity of 2 MB only in the scenario of memory copy between two devices. If the allocated memory is not a multiple of 2 MB, it is rounded up to the nearest multiple of 2 MB.

If this item is configured, huge page memory is allocated first. If the huge page memory is insufficient, the normal page memory is used.

The Atlas A3 training products/Atlas A3 inference products support this option.

The Atlas A2 training products/Atlas A2 inference products support this option.

The Atlas 200I/500 A2 inference products do not support this option.

The Atlas inference products support this option. If collective communication is involved, the communicator must be initialized before any other operations that require device memory allocation. Otherwise, the initialization may fail due to insufficient P2P memory.

The Atlas training products support this option.

ACL_MEM_MALLOC_HUGE_ONLY_P2P

Allocates huge page memory with a granularity of 2 MB only in the scenario of memory copy between two devices. If the allocated memory is not a multiple of 2 MB, it is rounded up to the nearest multiple of 2 MB.

If this item is configured, only huge page memory is allocated. If the huge page memory is insufficient, an error is returned.

The Atlas A3 training products/Atlas A3 inference products support this option.

The Atlas A2 training products/Atlas A2 inference products support this option.

The Atlas 200I/500 A2 inference products do not support this option.

The Atlas inference products support this option. If collective communication is involved, the communicator must be initialized before any other operations that require device memory allocation. Otherwise, the initialization may fail due to insufficient P2P memory.

The Atlas training products support this option.

ACL_MEM_MALLOC_NORMAL_ONLY_P2P

Allocates only normal page memory in the scenario of memory copy between two devices.

The Atlas A3 training products/Atlas A3 inference products support this option.

The Atlas A2 training products/Atlas A2 inference products support this option.

The Atlas 200I/500 A2 inference products do not support this option.

The Atlas inference products support this option. If collective communication is involved, the communicator must be initialized before any other operations that require device memory allocation. Otherwise, the initialization may fail due to insufficient P2P memory.

The Atlas training products support this option.

ACL_MEM_MALLOC_HUGE1G_ONLY

Allocates huge page memory with a granularity of 1 GB. If the allocated memory is not a multiple of 1 GB, it is rounded up to the nearest multiple of 1 GB. For example, if 1.9 GB memory is requested, 2 GB memory is actually allocated according to the round-up rule.

If this item is configured, only huge page memory is allocated. If the huge page memory is insufficient, an error is returned.

Allocating 1 GB of huge page memory with ACL_MEM_MALLOC_HUGE_ONLY (2 MB granularity) requires 512 (1024/2) page tables, whereas allocating the same amount with ACL_MEM_MALLOC_HUGE1G_ONLY (1 GB granularity) requires only one. Using fewer page tables expands the address range covered by the translation lookaside buffer (TLB) and improves performance for discrete (non-contiguous) memory accesses. The TLB is a hardware cache in the Ascend AI Processor that stores the mappings between recently used virtual addresses and physical addresses.

The Atlas A3 training products/Atlas A3 inference products support this option.

The Atlas A2 training products/Atlas A2 inference products support this option.

The Atlas 200I/500 A2 inference products do not support this option.

The Atlas inference products do not support this option.

The Atlas training products do not support this option.

ACL_MEM_MALLOC_HUGE1G_ONLY_P2P

Allocates huge page memory with a granularity of 1 GB only in the scenario of memory copy between two devices. If the allocated memory is not a multiple of 1 GB, it is rounded up to the nearest multiple of 1 GB. For example, if 1.9 GB memory is requested, 2 GB memory is actually allocated according to the round-up rule.

If this item is configured, only huge page memory is allocated. If the huge page memory is insufficient, an error is returned.

Allocating 1 GB of huge page memory with ACL_MEM_MALLOC_HUGE_ONLY_P2P (2 MB granularity) requires 512 (1024/2) page tables, whereas allocating the same amount with ACL_MEM_MALLOC_HUGE1G_ONLY_P2P (1 GB granularity) requires only one. Using fewer page tables expands the address range covered by the TLB and improves performance for discrete (non-contiguous) memory accesses. The TLB is a hardware cache in the Ascend AI Processor that stores the mappings between recently used virtual addresses and physical addresses.

The Atlas A3 training products/Atlas A3 inference products support this option.

The Atlas A2 training products/Atlas A2 inference products support this option.

The Atlas 200I/500 A2 inference products do not support this option.

The Atlas inference products do not support this option.

The Atlas training products do not support this option.

ACL_MEM_TYPE_LOW_BAND_WIDTH

Allocates memory from the physical memory with low bandwidth.

Setting this item currently has no effect. By default, the system selects the allocation rule based on the memory types that the hardware supports.

ACL_MEM_TYPE_HIGH_BAND_WIDTH

Allocates memory from the physical memory with high bandwidth.

Setting this item currently has no effect. By default, the system selects the allocation rule based on the memory types that the hardware supports.

ACL_MEM_ACCESS_USER_SPACE_READONLY

Specifies that the allocated memory is read-only in user mode. Attempts to modify the memory in user mode fail.
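The size arithmetic described in Table 1 (rounding up to the 2 MB or 1 GB granularity, the 1 MB threshold of ACL_MEM_MALLOC_HUGE_FIRST, and the 512-versus-1 count in the 1 GB rows) can be sketched as follows. This is a minimal illustration of the documented rules only. The helper names and the MIB/GIB macros are assumptions made for this example and are not part of the ACL API, and the sketch treats 1 MB as 1024 × 1024 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)
#define GIB (1024ULL * MIB)

/* Number of granularity-sized units needed to cover `size` bytes. */
static uint64_t units_needed(uint64_t size, uint64_t granularity)
{
    assert(granularity > 0);
    return size / granularity + (size % granularity != 0 ? 1 : 0);
}

/* Round `size` up to the nearest multiple of `granularity`
 * (2 * MIB for the 2 MB policies, GIB for the HUGE1G policies). */
static uint64_t round_up(uint64_t size, uint64_t granularity)
{
    return units_needed(size, granularity) * granularity;
}

/* ACL_MEM_MALLOC_HUGE_FIRST rule from Table 1: requests of 1 MB or less
 * use normal pages; larger requests try huge pages first. */
static int huge_first_tries_huge_pages(uint64_t size)
{
    return size > 1 * MIB;
}
```

For instance, round_up(3 * MIB, 2 * MIB) gives 4 MB, and a request of about 1.9 GB rounds up to 2 GB at 1 GB granularity. Likewise, units_needed(GIB, 2 * MIB) is 512 while units_needed(GIB, GIB) is 1, which matches the 512-versus-1 comparison in the table.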