Function: load_from_mem_with_mem

Applicability

Product

Supported (√/x)

Atlas A3 training products / Atlas A3 inference products

Atlas A2 training products / Atlas A2 inference products

Atlas training products

Atlas inference products

Atlas 200I/500 A2 inference products

Function Usage

Loads an offline model from memory. The memory used to run the model (workspace and weight memory) is managed by the user.

Returns the model ID after the model is loaded. The model ID is used for model identification in subsequent operations.

Prototype

  • C Prototype
    aclError aclmdlLoadFromMemWithMem(const void *model, size_t modelSize, uint32_t *modelId, void *workPtr, size_t workSize, void *weightPtr, size_t weightSize)
    
  • Python Function
    model_id, ret = acl.mdl.load_from_mem_with_mem(model, model_size, work_ptr, work_size, weight_ptr, weight_size)
    

Parameter Description

Parameter

Description

model

Int, memory address of the model.
  • If the app runs on the host, allocate host memory.
  • If the app runs on the device, allocate device memory.
  • To find out where the app runs, call acl.rt.get_run_mode.
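The choice between host and device memory can be sketched as follows. This is a minimal illustration, not the library's own logic: pick_model_allocator is a hypothetical helper, and the numeric run-mode values (ACL_DEVICE = 0, ACL_HOST = 1) are assumptions to verify against your pyACL version. The commented wiring further assumes that acl.rt.malloc_host(size) and acl.rt.malloc(size, policy) each return (ptr, ret).

```python
# Assumed numeric values of the run modes reported by acl.rt.get_run_mode.
ACL_DEVICE = 0
ACL_HOST = 1


def pick_model_allocator(run_mode, malloc_host, malloc_device):
    """Return the allocator that matches where the app runs.

    malloc_host and malloc_device are callables supplied by the caller
    (hypothetical wiring, see the usage comment below).
    """
    if run_mode == ACL_HOST:
        return malloc_host
    if run_mode == ACL_DEVICE:
        return malloc_device
    raise ValueError(f"unexpected run mode: {run_mode}")


# Possible wiring on a machine with pyACL installed (not executed here):
#   import acl
#   run_mode, ret = acl.rt.get_run_mode()
#   alloc = pick_model_allocator(
#       run_mode,
#       acl.rt.malloc_host,
#       lambda size: acl.rt.malloc(size, 0),  # 0: assumed ACL_MEM_MALLOC_HUGE_FIRST
#   )
#   model_ptr, ret = alloc(model_size)
```

Passing the allocators in as callables keeps the decision logic separate from the library calls, so it can be checked without an Ascend device.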

model_size

Int, model size in bytes.

model_id

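Int, model ID generated after the model is loaded. In the Python function, the model ID is returned as part of the result rather than passed as an argument.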

work_ptr

Int, address of the workspace (for storing model input and output data) required by the model on the device. The memory is managed by the user and must not be freed during model execution. If 0 is passed for work_ptr, the system manages the memory.

NOTE:

When the memory is managed by the user and multiple models are executed serially, the models can share one workspace. The user must then guarantee both the serial execution order of the models and the workspace size (equal to the total size of the workspaces needed by all the models). To ensure serial execution:

  • For synchronous model execution, add a lock so that tasks are executed serially.
  • For asynchronous model execution, use a single stream so that tasks are executed serially.
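For the synchronous case, the lock-based approach might look like the sketch below. SerialExecutor and shared_workspace_size are hypothetical names introduced for illustration. The execute function is passed in by the caller; with pyACL it would presumably be acl.mdl.execute, which is assumed here to take (model_id, input, output) and return an error code.

```python
import threading


def shared_workspace_size(work_sizes):
    """Size of a workspace shared by several models.

    Follows the note above: the total of the workspace sizes
    needed by all the models.
    """
    return sum(work_sizes)


class SerialExecutor:
    """Runs synchronous model executions one at a time."""

    def __init__(self, execute_fn):
        # execute_fn(model_id, inputs, outputs) -> ret; with pyACL this
        # would presumably be acl.mdl.execute (an assumption).
        self._execute_fn = execute_fn
        self._lock = threading.Lock()

    def run(self, model_id, inputs, outputs):
        # Only one thread at a time may use the shared workspace.
        with self._lock:
            return self._execute_fn(model_id, inputs, outputs)
```

Every thread that executes a model sharing the workspace would call run on the same SerialExecutor instance, so the single lock covers all of them.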

work_size

Int, workspace size required for model execution, in bytes. This parameter is ignored when work_ptr is set to 0.

weight_ptr

Int, address of the model weight memory (for storing weight data) on the device. The memory is managed by the user and must not be freed during model execution. If 0 is passed for weight_ptr, the system manages the memory.

NOTE:

When the weight memory is managed by the user and, in a multi-thread scenario, each thread loads the model once, the threads can share the same weight_ptr memory, because the weight memory is read-only during inference.

Note that weight_ptr must not be freed while it is being shared.

weight_size

Int, weight memory size in bytes. This parameter is ignored when weight_ptr is set to 0.
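Putting the parameters together, a load with user-managed memory could be sketched as below. This is an illustration under stated assumptions, not a reference implementation: load_with_user_memory and _check are hypothetical helpers, acl.mdl.query_size_from_mem is assumed to return (work_size, weight_size, ret), acl.rt.malloc is assumed to return (ptr, ret), and the policy value 0 is assumed to mean ACL_MEM_MALLOC_HUGE_FIRST. The acl module is passed in as an argument so the sketch can be read and imported without the library installed. Cleanup of already allocated buffers on failure is omitted for brevity.

```python
ACL_MEM_MALLOC_HUGE_FIRST = 0  # assumed value of the malloc policy constant


def _check(ret, what):
    """Raise if a call reported a non-zero error code."""
    if ret != 0:
        raise RuntimeError(f"{what} failed with error code {ret}")


def load_with_user_memory(acl, model_ptr, model_size):
    """Load a model whose workspace and weight memory the caller owns.

    Returns (model_id, work_ptr, weight_ptr). The caller should free
    both buffers only after the model has been unloaded.
    """
    # Assumed call: reports the workspace and weight sizes the model needs.
    work_size, weight_size, ret = acl.mdl.query_size_from_mem(model_ptr, model_size)
    _check(ret, "query_size_from_mem")

    # Allocate device memory for the workspace and the weights.
    work_ptr, ret = acl.rt.malloc(work_size, ACL_MEM_MALLOC_HUGE_FIRST)
    _check(ret, "malloc(workspace)")
    weight_ptr, ret = acl.rt.malloc(weight_size, ACL_MEM_MALLOC_HUGE_FIRST)
    _check(ret, "malloc(weights)")

    model_id, ret = acl.mdl.load_from_mem_with_mem(
        model_ptr, model_size, work_ptr, work_size, weight_ptr, weight_size
    )
    _check(ret, "load_from_mem_with_mem")
    return model_id, work_ptr, weight_ptr
```

Returning the two pointers alongside the model ID makes it harder to lose track of memory that must outlive every execution of the model.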

Return Value Description

Return Value

Description

model_id

Int, model ID generated after the model is loaded.

ret

Int, error code. 0 indicates success; any other value indicates failure.
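Because the error code is returned alongside the result, callers need to check it after every call. One way to centralize that check is sketched below; unwrap and AclCallError are hypothetical names, not part of pyACL.

```python
class AclCallError(RuntimeError):
    """Raised when a pyACL-style call reports a non-zero error code."""


def unwrap(result, what):
    """Split a (value, ..., ret) tuple and raise if ret is non-zero.

    Returns the single value, or a tuple when there are several.
    """
    *values, ret = result
    if ret != 0:
        raise AclCallError(f"{what} failed with error code {ret}")
    return values[0] if len(values) == 1 else tuple(values)


# Possible usage with pyACL (not executed here):
#   model_id = unwrap(
#       acl.mdl.load_from_mem_with_mem(model, model_size, work_ptr,
#                                      work_size, weight_ptr, weight_size),
#       "load_from_mem_with_mem",
#   )
```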

Restrictions

A model must be loaded, executed, and unloaded in the same context. For details about how to create a context, see acl.rt.set_device and acl.rt.create_context.
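One way to keep loading, executing, and unloading together is to scope the model with a context manager, as in the sketch below. loaded_model is a hypothetical helper. It only helps if the current context is not switched inside the with block, and it assumes that the unload function (with pyACL, presumably acl.mdl.unload) takes the model ID and returns an error code.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def loaded_model(load_fn, unload_fn):
    """Load a model on entry and unload it on exit.

    load_fn() -> (model_id, ret) and unload_fn(model_id) -> ret are
    supplied by the caller.
    """
    model_id, ret = load_fn()
    if ret != 0:
        raise RuntimeError(f"model load failed with error code {ret}")
    try:
        yield model_id
    finally:
        # Unload even if execution raised; warn rather than raise so
        # that an exception from the with block is not masked.
        ret = unload_fn(model_id)
        if ret != 0:
            warnings.warn(f"model unload failed with error code {ret}")


# Possible usage with pyACL (not executed here):
#   with loaded_model(
#       lambda: acl.mdl.load_from_mem_with_mem(model, model_size, work_ptr,
#                                              work_size, weight_ptr, weight_size),
#       acl.mdl.unload,
#   ) as model_id:
#       ...  # execute the model here, in the same context
```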

Reference

For the API call sequence, see Loading a Model.