Loading a Model

Load the offline model (.om file) generated in Model Building and prepare it for model execution.

API Call Sequence

If your application performs inference on an entire network, ensure that it contains the code logic for model loading. For details about the API call sequence, see pyACL API Call Sequence.

This section describes the API call sequence for loading a model on the entire network. For details about loading and executing a single operator, see Single-Operator Calling.

pyACL provides two sets of model loading APIs; choose one based on your application scenario.

  • Figure 1: Call a dedicated API for each loading mode (for example, loading from a file or from memory). This approach is relatively simple, but you must remember the API for each mode.
  • Figure 2: Select the loading mode (for example, loading from a file or from memory) by setting configuration parameters. This approach covers more scenarios, but several APIs must be used together: one to create the configuration object, one to set its attribute values, and one to load the model.
Figure 1 Model loading workflow (using different model loading APIs)
Figure 2 Model loading workflow (setting parameters in the model loading API)

The key APIs are described as follows:

  • Before loading a model, build an offline model (.om file) adapted to the Ascend AI Processor. For details, see Model Building.
  • If the memory is managed by the user, first call acl.mdl.query_size to query the sizes of the workspace and weight memory required for model execution, so that no memory is wasted.

    If the shape of the input data is not fixed, acl.mdl.query_size cannot determine the required memory size, so the memory cannot be managed by the user during model loading. In this case, call acl.mdl.load_from_file or acl.mdl.load_from_mem so that the system manages the memory.

  • A model can be loaded using the following APIs. A model ID is returned after the model is successfully loaded.
    • When using different model loading APIs, the caller can determine whether to load the model from a file or from memory and whether the memory is managed by the system or the user:
      • acl.mdl.load_from_file: loads offline model data from a file. The memory for running the model is managed by the system.
      • acl.mdl.load_from_mem: loads offline model data from the memory. The memory for running the model is managed by the system.
      • acl.mdl.load_from_file_with_mem: loads offline model data from a file. The memory (including the workspace for storing temporary data at model runtime and weight memory for storing the weight data of the model) is managed by the user.
      • acl.mdl.load_from_mem_with_mem: loads offline model data from the memory. The memory (including workspace and weight memory) is managed by the user.
    • When using the configuration-based APIs (acl.mdl.set_config_opt and acl.mdl.load_with_config), the caller determines whether the model is loaded from a file or from memory, and whether the memory is managed by the system or by the user, by setting attributes in the configuration object.
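When the user manages the memory, the sequence described above (query the required sizes, allocate both regions, then load) can be sketched as follows. A small stub stands in for the real pyACL module so that the call order is runnable without a device; in a real application you would `import acl`, and the return conventions assumed here for acl.mdl.query_size, acl.rt.malloc, and acl.mdl.load_from_file_with_mem are taken from pyACL's (value, return-code) tuple style.

```python
# Sketch of the user-managed-memory loading sequence. A stub mimics the
# (assumed) pyACL return conventions; replace it with `import acl` on a
# real device. The sizes and pointers below are dummy values.
from types import SimpleNamespace

ACL_MEM_MALLOC_HUGE_FIRST = 0  # pyACL memory-allocation policy

def _make_stub_acl():
    """Stand-in for the pyACL module, returning dummy values."""
    mdl = SimpleNamespace(
        query_size=lambda path: (1024, 2048, 0),  # (work_size, weight_size, ret)
        load_from_file_with_mem=lambda path, wp, ws, gp, gs: (1, 0),  # (model_id, ret)
    )
    rt = SimpleNamespace(malloc=lambda size, policy: (0x1000, 0))  # (ptr, ret)
    return SimpleNamespace(mdl=mdl, rt=rt)

acl = _make_stub_acl()
model_path = "./model/resnet50.om"

# 1. Query the workspace and weight memory sizes the model needs.
work_size, weight_size, ret = acl.mdl.query_size(model_path)
assert ret == 0

# 2. Allocate device memory for both regions.
work_ptr, ret = acl.rt.malloc(work_size, ACL_MEM_MALLOC_HUGE_FIRST)
assert ret == 0
weight_ptr, ret = acl.rt.malloc(weight_size, ACL_MEM_MALLOC_HUGE_FIRST)
assert ret == 0

# 3. Load the model into the user-managed memory; a model ID is returned.
model_id, ret = acl.mdl.load_from_file_with_mem(
    model_path, work_ptr, work_size, weight_ptr, weight_size)
assert ret == 0
```

The same pattern applies to acl.mdl.load_from_mem_with_mem, with the model data passed from memory instead of a file path.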

Sample Code

After the model is loaded successfully, a model ID is returned; this ID is used in Executing a Model.

You can view the complete code in Sample Overview.

After each API call, add an exception handling branch and record error and warning logs. The following snippet shows only the key steps and is not directly runnable.

# Initialize variables.
model_path = "./model/resnet50.om"
# ......

# Load the offline model file (adapted to the Ascend AI Processor). The system manages the memory (including the weight memory and workspace) for running the model.
# Successful model loading returns a model ID.
model_id, ret = acl.mdl.load_from_file(model_path)

# ......
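The exception handling branch recommended above can be sketched as follows. The load function here is a stand-in for acl.mdl.load_from_file so the snippet runs without a device; ACL_SUCCESS (value 0) is the usual pyACL success code.

```python
# Sketch of checking the return code after model loading. A stub stands in
# for acl.mdl.load_from_file; in a real application, also record an error
# log before aborting initialization.
ACL_SUCCESS = 0

def load_from_file_stub(path):
    """Stand-in for acl.mdl.load_from_file, returning (model_id, ret)."""
    return (1, ACL_SUCCESS) if path.endswith(".om") else (0, 1)

model_path = "./model/resnet50.om"
model_id, ret = load_from_file_stub(model_path)
if ret != ACL_SUCCESS:
    # Error branch: log the failure and stop; do not continue to execution.
    raise RuntimeError(
        f"load model from file failed, path: {model_path}, errorCode: {ret}")
```

On success, keep the returned model ID: it is the handle passed to the model execution and unload APIs later.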