Model Inference

Function Description

Vision SDK performs model inference on the specified input and model to obtain the output result. Models in the OM and MindIR formats are supported, as are ATC-built models that use dynamic batch sizes, dynamic image sizes, or dynamic dimension profiles.

For details about related APIs, see Model.

API Calling Process

Before model inference, prepare the input data and the model to be loaded, initialize the Model class from the model path or from memory, and call the Infer API of the Model class to obtain the inference result. The data type and format of the input must match those of the model input. If you allocate the output memory yourself, the data type and format of the output must match those of the model output. You can query the model input and output information through the related APIs of the Model class.
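
Because the input must match the model's data type and shape, it can help to verify the buffer size before constructing a tensor. The following is a minimal sketch in standard C++ that does not call the SDK. ExpectedByteSize and BufferMatchesShape are hypothetical helper names, and the element size is supplied by the caller (for example, 4 bytes for an INT32 input).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the SDK): number of bytes occupied by a
// dense tensor with the given shape and element size.
// An empty shape is treated as "no data" and yields 0.
std::size_t ExpectedByteSize(const std::vector<uint32_t>& shape, std::size_t elemSize)
{
    if (shape.empty()) {
        return 0;
    }
    std::size_t count = 1;
    for (uint32_t dim : shape) {
        count *= dim;
    }
    return count * elemSize;
}

// Returns true when a buffer of bufferBytes bytes exactly fits the shape.
bool BufferMatchesShape(std::size_t bufferBytes, const std::vector<uint32_t>& shape,
                        std::size_t elemSize)
{
    return bufferBytes == ExpectedByteSize(shape, elemSize);
}
```

For the shape {1, 128} with INT32 elements used in the sample code, the expected size is 1 x 128 x 4 = 512 bytes.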

The API calling process for model inference is as follows:

Figure 1 Process of calling APIs of model inference

The key APIs are described as follows:

  1. Perform global initialization by calling MxInit().
  2. Initialize the model.

    Load the model using either of the following methods based on your service requirements:

    • Load the model from a file by directly inputting the model path to the Model API for initialization.
    • Set the loading mode in the loadType field of the ModelLoadOptV2 structure, and then pass the structure to the Model API. The loading mode determines whether the model is loaded from a file or from memory, and whether the memory is managed by the system or by the user. For details, see ModelLoadOptV2.
  3. Select synchronous or asynchronous inference based on your service requirements.
    • Synchronous inference

      Determine how to obtain the output data. You can either let the Infer API construct the output data, or construct the output data yourself to receive the inference result.

    • Asynchronous inference (for the Atlas inference product only)
      1. Create a stream. For details, see Asynchronous Invocation.
      2. Construct and receive the output data and pass the created stream.
  4. Deinitialize the initialized global resources by calling MxDeInit().
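
Steps 1 and 4 form a pair: resources set up by MxInit() are released by MxDeInit(). The sample code places the model and tensors in an inner scope, presumably so that they are destroyed before MxDeInit() runs. One way to keep the pair together on every exit path is a small scope guard. The following is a generic C++ sketch. ScopeGuard is a hypothetical class, not an SDK type, and the SDK calls appear only in comments.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper (not part of the SDK): runs a cleanup callable when the
// guard goes out of scope, including on early return or exception.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> cleanup) : cleanup_(std::move(cleanup)) {}
    ~ScopeGuard()
    {
        if (cleanup_) {
            cleanup_();
        }
    }
    // Non-copyable, so the cleanup runs exactly once.
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> cleanup_;
};

// Intended use (assumes the SDK headers are available):
//   MxBase::MxInit();
//   ScopeGuard deinit([] { MxBase::MxDeInit(); });
//   MxBase::Model model(modelPath);  // declared after the guard, so it is
//                                    // destroyed before MxDeInit() is called
```

Local objects are destroyed in reverse order of declaration, so anything declared after the guard is released before the cleanup callable runs.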

Sample Code

The following sample code illustrates the key steps only. It is for reference and cannot be copied directly for compilation or execution.

// Initialization
MxBase::MxInit();
{
    // Input image binary data prepared by the user
    std::string filePath = "./test.bin";   
    // Read the input data into memory. ReadTensor is not shown here and is
    // assumed to be a helper provided by the user.
    void* dataPtr = ReadTensor(filePath);
    // Input data type, which is the same as the model input data type.
    auto dataType = MxBase::TensorDType::INT32;     
    // Construct the input shape, which is the same as that of the model.
    std::vector<uint32_t> shape = {1, 128};       
    // Construct a tensor.
    MxBase::Tensor tensor(dataPtr, shape, dataType, 0);   
    // Construct the model input.
    std::vector<MxBase::Tensor> inputs{tensor};
    // Model path specified by the user
    std::string modelPath = "./test.om";    
    // Load the model based on the model path.
    MxBase::Model model(modelPath);           
    // Perform model inference. The outputs are the inference results.
    std::vector<MxBase::Tensor> outputs = model.Infer(inputs);  
}
// Deinitialization
MxBase::MxDeInit();

The following is an example of initialization based on the ModelLoadOptV2 structure:
MxBase::ModelLoadOptV2 mdlLoadOpt;
mdlLoadOpt.loadType = MxBase::ModelLoadOptV2::LOAD_MODEL_FROM_FILE;  // Specify the model loading mode.
mdlLoadOpt.modelPath = modelPath;
MxBase::Model model(mdlLoadOpt);
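
The first sample calls ReadTensor without showing it. The following is a minimal sketch of such a helper, assuming the input is a raw binary file whose bytes are already in the layout the model expects. ReadFileBytes is a hypothetical name. It returns a std::vector instead of the void* used in the sample, so the buffer is freed automatically.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the SDK): read a whole binary file into memory.
// Throws std::runtime_error if the file cannot be opened or read.
std::vector<uint8_t> ReadFileBytes(const std::string& filePath)
{
    // Open at the end of the file so that tellg() reports the file size.
    std::ifstream file(filePath, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open " + filePath);
    }
    const std::streamsize size = file.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + filePath);
    }
    file.seekg(0, std::ios::beg);

    std::vector<uint8_t> buffer(static_cast<std::size_t>(size));
    if (size > 0 && !file.read(reinterpret_cast<char*>(buffer.data()), size)) {
        throw std::runtime_error("cannot read " + filePath);
    }
    return buffer;
}
```

Where a raw pointer is required, pass buffer.data() and keep the vector alive for as long as the pointer is in use.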

Inference Using a MindIR Model

The inference process for a MindIR model is the same as for an OM model. Before using a MindIR model, however, you need to install the MindSpore Lite software package and set the environment variables. The procedure is as follows.

Pay attention to vulnerabilities listed in the MindSpore open-source community and fix them in a timely manner.

  1. Download the MindSpore Lite software package.
    • Linux-x86_64 version: Click here.
    • Linux-AArch64 version: Click here.
  2. Upload the downloaded .tar package to the environment where the Vision SDK service is running.
  3. Decompress the .tar package.
    tar -zxvf mindspore-lite-2.4.0-linux-{arch}.tar.gz --no-same-owner
    
  4. Set the environment variables.

    ARM servers:

    export LD_LIBRARY_PATH={path}/runtime/lib:${LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH={path}/tools/converter/lib:${LD_LIBRARY_PATH}
    

    x86_64 servers:

    export LD_LIBRARY_PATH={path}/runtime/lib:${LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH={path}/tools/converter/lib:${LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH={path}/runtime/third_party/dnnl:${LD_LIBRARY_PATH}
    

    {path} indicates the directory generated after the MindSpore Lite software package is decompressed. Change it as required.

  5. Verify the environment variable settings.
    echo $LD_LIBRARY_PATH
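
Besides printing the variable, an application can check at startup that the expected directories are present. The following sketch uses standard C++ only. SplitPathList, PathListContains, and GetLdLibraryPath are hypothetical helpers, and the comparison is a plain string match that does not resolve symbolic links or trailing slashes.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: split a colon-separated path list into its non-empty entries.
std::vector<std::string> SplitPathList(const std::string& pathList)
{
    std::vector<std::string> entries;
    std::size_t start = 0;
    while (start <= pathList.size()) {
        std::size_t end = pathList.find(':', start);
        if (end == std::string::npos) {
            end = pathList.size();
        }
        if (end > start) {
            entries.push_back(pathList.substr(start, end - start));
        }
        start = end + 1;
    }
    return entries;
}

// Returns true if dir appears as an exact entry in the path list.
bool PathListContains(const std::string& pathList, const std::string& dir)
{
    for (const std::string& entry : SplitPathList(pathList)) {
        if (entry == dir) {
            return true;
        }
    }
    return false;
}

// Reads LD_LIBRARY_PATH from the environment; returns an empty string if it is unset.
std::string GetLdLibraryPath()
{
    const char* value = std::getenv("LD_LIBRARY_PATH");
    return value ? std::string(value) : std::string();
}
```

For example, PathListContains(GetLdLibraryPath(), "{path}/runtime/lib"), with {path} replaced by the actual decompression directory, reports whether the runtime library directory has been added.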