Overview

  • If your application involves media data processing, note the following points when using memory:
    1. Media data processing places stricter requirements on the memory that stores its input and output data (for example, the start address of the memory must be 128-byte aligned). This memory must therefore be allocated with the dedicated media data processing memory allocation APIs.
    2. Memory allocated by these dedicated APIs can be used for media data processing and for other tasks. For example, the output of media data processing can be passed directly as the input of model inference, which reuses the memory and reduces memory copies.
    3. Because the address space that media data processing can access is limited, you are advised to allocate memory for other functions (for example, model loading) by calling acl.rt.malloc, acl.rt.malloc_host, or acl.rt.malloc_cached, described in section "Memory Management". This keeps enough memory available for media data processing.
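The 128-byte alignment example in point 1 can be checked with plain integer arithmetic. The following is a minimal sketch, assuming the buffer address is available as a Python integer; the helper names `is_aligned` and `align_up` are hypothetical and are not part of pyACL.

```python
ALIGNMENT = 128  # bytes; the example alignment requirement quoted above


def is_aligned(address: int, alignment: int = ALIGNMENT) -> bool:
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


def align_up(value: int, alignment: int = ALIGNMENT) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


if __name__ == "__main__":
    # Hypothetical address and size, used only to exercise the helpers.
    print(is_aligned(0x1000))  # 4096 is a multiple of 128
    print(align_up(1000))      # next multiple of 128 at or above 1000
```

A check like this is only a sanity test on an address you already hold. It does not replace the dedicated allocation APIs, which are what actually guarantee memory that meets the media data processing requirements.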
  • For the Atlas A2 training products/Atlas A2 inference products, if you need to allocate hugepage memory on the device, note that in the current version the system already reserves some hugepage memory to guard against shortages. Before using hugepage memory, you can call acl.rt.get_mem_info to query the free hugepage memory (ACL_HBM_MEM_HUGE) and the free normal memory (ACL_HBM_MEM_NORMAL).
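The hugepage query described above might be wrapped as follows. This is a sketch under stated assumptions, not a definitive implementation: it assumes that acl.rt.get_mem_info(attr) returns a (free, total, ret) tuple with ret equal to 0 on success, and that ACL_HBM_MEM_HUGE and ACL_HBM_MEM_NORMAL have the integer values 4 and 5. Verify both against the API reference for your CANN version. The function name `query_hbm_memory` is hypothetical, and the runtime module is passed in as a parameter so that the code can be imported on a machine without an Ascend runtime.

```python
# Assumed values of the memory attribute enum; confirm for your CANN version.
ACL_HBM_MEM_HUGE = 4
ACL_HBM_MEM_NORMAL = 5


def query_hbm_memory(rt):
    """Return free and total bytes for hugepage and normal HBM memory.

    rt is expected to be the acl.rt module. This sketch assumes that
    rt.get_mem_info(attr) returns (free, total, ret) with ret == 0 on success.
    """
    result = {}
    for name, attr in (("huge", ACL_HBM_MEM_HUGE), ("normal", ACL_HBM_MEM_NORMAL)):
        free, total, ret = rt.get_mem_info(attr)
        if ret != 0:
            raise RuntimeError(f"get_mem_info failed for {name} memory, error code {ret}")
        result[name] = {"free": free, "total": total}
    return result
```

On a device, after acl.init() and acl.rt.set_device(device_id) have succeeded, this would be called as query_hbm_memory(acl.rt). Comparing the "huge" free value with the size you intend to allocate shows whether the hugepage request is likely to fit.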
  • In Ascend EP mode, the first time an inference or training job is executed after the device starts, the AI CPU operators are copied from the host to the device and cached there to improve performance. This cache occupies some device memory (100 MB to 200 MB, depending on the chip). The AI CPU operator cache is released when the device is restarted.

    This restriction applies to the following products:

    • Atlas inference products
    • Atlas 200I/500 A2 inference products
    • Atlas training products
    • Atlas A2 training products
    • Atlas A3 training products/Atlas A3 inference products
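When planning device memory on the products listed above, the AI CPU operator cache can be budgeted for explicitly. The sketch below is illustrative only: it reserves the upper end of the 100 MB to 200 MB range before the first job has run, treats 1 MB as 1024 * 1024 bytes, and assumes that the free memory reported after the first job already accounts for the cache. The function name `plannable_device_memory` is hypothetical.

```python
MIB = 1024 * 1024
# Upper end of the 100 MB to 200 MB range; the actual size depends on the chip.
AICPU_CACHE_WORST_CASE = 200 * MIB


def plannable_device_memory(free_bytes: int, first_job_done: bool) -> int:
    """Estimate how much device memory the application can plan to use.

    Before the first job runs, the AI CPU operator cache does not exist yet,
    so the worst-case cache size is subtracted from the free memory. After the
    first job, the free memory is assumed to reflect the cache already.
    """
    if first_job_done:
        return free_bytes
    return max(free_bytes - AICPU_CACHE_WORST_CASE, 0)
```

The free_bytes argument would typically come from a memory query such as acl.rt.get_mem_info taken before the first job is launched.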