Infer

Function Usage

Model inference API. It supports inference of ATC-built models with dynamic batch sizes, dynamic resolutions, and dynamic dimensions.

For dynamic-shape model inference, the input tensor must meet the requirements set during model building. If the input tensor shape does not match any of the profiles set during model building, an error message is reported indicating that the dynamic batch, resolution, or dimension information failed to be set. For details about the error codes, see APP_ERROR.

For example, if the batch profile is set to 2,4,8 during dynamic-batch model building and a tensor whose batch size is 1 is passed in, the error message "Dynamic batch set failed, modelId = 1, index = 1, dynamicBatchSize = 1" is reported during inference.

A model that is loaded once holds a single set of internal resources, so one model instance cannot serve concurrent inference calls from multiple threads. For multi-threaded inference, load a separate model instance in each thread and invoke the inference service on that instance.
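The per-thread loading pattern described above can be sketched as follows. This is a minimal illustration, not runnable without the MxBase SDK and Ascend hardware; the MxBase::Model constructor signature and the PrepareInputs() helper are assumptions for illustration only.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical worker: each thread loads its own model instance so that
// internal model resources are never shared across threads.
void InferWorker(const std::string& modelPath, int32_t deviceId) {
    // Constructor signature is an assumption based on typical MxBase usage.
    MxBase::Model model(modelPath, deviceId);

    // PrepareInputs() is a user-defined helper (not part of the API)
    // that builds input tensors matching the model's input shapes.
    std::vector<MxBase::Tensor> inputs = PrepareInputs();

    // Each thread calls Infer only on its own model instance.
    std::vector<MxBase::Tensor> outputs = model.Infer(inputs);
    // ... post-process outputs ...
}

int main() {
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        workers.emplace_back(InferWorker, "model.om", 0);
    }
    for (auto& t : workers) {
        t.join();
    }
    return 0;
}
```

Loading one instance per thread trades extra device memory for thread safety; if memory is constrained, an alternative is to serialize Infer calls on a single instance with a mutex, at the cost of losing concurrency.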

Prototype

APP_ERROR Infer(std::vector<Tensor>& inputTensors, std::vector<Tensor>& outputTensors)
// outputTensors is constructed by the caller; its memory must be allocated with TensorMalloc().
std::vector<Tensor> Infer(std::vector<Tensor>& inputTensors)
// Allocates the output memory internally and returns the output tensors to the caller after inference.

Parameter Description

Parameter        Input/Output    Description
inputTensors     Input           Input tensors required by the model.
outputTensors    Output          Output tensors produced by the model.

Return Parameter Description

Data Structure         Description
std::vector<Tensor>    Output tensors of the model.
APP_ERROR              Error code returned during program execution. For details, see the MxBase/ErrorCode/ErrorCode.h file.
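The two overloads above can be used as sketched below. This is a hedged example, not runnable without the MxBase SDK; it assumes an already loaded MxBase::Model named `model` and prepared input tensors, and the exact TensorMalloc() invocation and the APP_ERR_OK constant name are assumptions for illustration.

```cpp
#include <vector>

// Inputs are assumed to be prepared to match the model's input shapes
// (and, for dynamic-shape models, one of the profiles set during building).
std::vector<MxBase::Tensor> inputs = /* prepared input tensors */;

// Overload 1: the caller constructs the output tensors and allocates
// their memory with TensorMalloc() before calling Infer.
std::vector<MxBase::Tensor> outputs = /* output tensors with memory
                                          allocated by TensorMalloc() */;
APP_ERROR ret = model.Infer(inputs, outputs);
if (ret != APP_ERR_OK) {
    // Handle the error; see MxBase/ErrorCode/ErrorCode.h for error codes.
}

// Overload 2: output memory is allocated internally and the output
// tensors are returned to the caller after inference.
std::vector<MxBase::Tensor> results = model.Infer(inputs);
```

The first overload suits pipelines that reuse pre-allocated output buffers across calls; the second is more convenient for one-off inference where allocation overhead is acceptable.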