Model Inference

Function Description

Vision SDK model inference takes a specified model and input and returns the output result. Inference supports models in the OM format, including ATC-built models with dynamic batch sizes, dynamic image sizes, and dynamic dimension profiles. The input for model inference is a tensor, which is constructed by calling the APIs provided by Vision SDK. Currently, the Vision SDK Python APIs support only synchronous inference.
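A dynamic-batch model built with ATC accepts only a fixed set of batch sizes. The following minimal sketch assumes the application assembles batches itself and needs to choose a profile and pad the batch to fit it. The function names pick_batch_profile and padding_needed, and the profile list in the usage note, are hypothetical and are not part of Vision SDK.

```python
from bisect import bisect_left


def pick_batch_profile(num_samples, profiles):
    """Return the smallest supported batch size that can hold num_samples.

    `profiles` is the set of batch sizes the model was built with
    (an assumption supplied by the caller, not queried from the SDK).
    """
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    ordered = sorted(profiles)
    idx = bisect_left(ordered, num_samples)
    if idx == len(ordered):
        raise ValueError(
            f"{num_samples} samples exceed the largest batch profile {ordered[-1]}"
        )
    return ordered[idx]


def padding_needed(num_samples, profiles):
    """Return how many padding samples must be appended to fill the profile."""
    return pick_batch_profile(num_samples, profiles) - num_samples
```

For example, with the assumed profiles 1, 2, 4, and 8, three samples would be placed in the batch-4 profile with one padding sample, and the padded output row would be discarded after inference.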

For details about related APIs, see Model Inference.

API Calling Process

Before model inference, prepare the input data and the model to be loaded. Then initialize the Model class from the model path or from memory, and call the infer API of the Model class to obtain the model inference result.
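When the model is loaded from memory, the model file must first be read into a buffer. The following is a minimal sketch of that step using only the Python standard library. The helper name read_model_bytes is hypothetical, and the way the buffer is handed to the Model class is not shown here because it depends on the fields described in ModelLoadOptV2.

```python
from pathlib import Path


def read_model_bytes(model_path):
    """Read an OM model file into memory and return its raw bytes."""
    path = Path(model_path)
    # Only the OM format is supported for inference.
    if path.suffix != ".om":
        raise ValueError(f"expected an .om model file, got: {path.name}")
    data = path.read_bytes()
    if not data:
        raise ValueError(f"model file is empty: {path}")
    return data
```

Checking the extension and the size before initialization gives a clearer error than a failed model load would.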

The process of calling the model inference APIs is as follows:

Figure 1 Process of calling APIs of model inference

The key APIs are described as follows:

1. Perform global initialization by calling mx_init().
2. Initialize the model. Determine the model loading mode based on service requirements: load the model from a file or from memory. To load the model from memory, first read the model file into memory. Initialize the Model class in either of the following ways:
   - Load the model from a file. Pass the model path directly to the Model API for initialization.
   - Specify the loading mode through the loadType field in the ModelLoadOptV2 structure, and then pass the structure to the Model API. The loading mode determines whether the model is loaded from a file or from memory, and whether the memory is managed by the system or by the user. For details, see ModelLoadOptV2.
3. Call the infer API to obtain the model inference result.
4. Perform deinitialization by calling mx_deinit().
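Because initialization and deinitialization must be paired, one option is to wrap them in a context manager so that deinitialization still runs when inference raises an exception. The following is a minimal sketch, not an SDK feature. The name sdk_session is hypothetical, and the init and deinit callables are passed in so that the sketch does not depend on the SDK being installed.

```python
from contextlib import contextmanager


@contextmanager
def sdk_session(init, deinit):
    """Call init() on entry and always call deinit() on exit."""
    init()
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        deinit()
```

Under these assumptions, the calling code would be written as `with sdk_session(base.mx_init, base.mx_deinit): process()`, with all SDK objects created inside process() so that they go out of scope before deinitialization.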

Sample Code

The following sample code shows only the key steps. It is for reference only and cannot be copied and run as is.

import numpy as np
from mindx.sdk import base
from mindx.sdk.base import Tensor, Model

def process():
    # Model inference
    # Device ID (0 is assumed here; change it to match your environment).
    device_id = 0
    # Construct an input tensor (binary input is used as an example).
    # Read the preprocessed NumPy array from a .npy file.
    input_array = np.load("preprocess_array.npy")
    # Construct the input Tensor object and transfer it to the device.
    input_tensor = Tensor(input_array)
    input_tensor.to_device(device_id)
    # Construct a list of input tensors.
    input_tensors = [input_tensor]
    # Model path
    model_path = "resnet50_batchsize_1.om"
    # Initialize the Model class.
    model = Model(modelPath=model_path, deviceId=device_id)
    # Execute inference.
    outputs = model.infer(input_tensors)

if __name__ == "__main__":
    base.mx_init()    # Initialize resources.
    process()
    base.mx_deinit()  # Deinitialize resources.
