Dynamic Batch/Dynamic Image Size/Dynamic Dimension (Setting Dimension Profiles)

This section describes the key APIs, API call sequence, and sample code for the dynamic batch/dynamic image size/dynamic dimension function.

API Call Sequence

The model inference workflow in the dynamic shape input scenario is similar to that in the scenario described in Model Inference. Both workflows include AscendCL initialization and deinitialization, runtime resource allocation and deallocation, model building, model loading, model execution, and model unloading.

This section focuses on the differences between the two scenarios.

  1. During model building: You need to set the dynamic batch size, dynamic image size, and dynamic dimensions (ND format).

    If the dynamic batch size feature is involved during model inference, call the AscendCL API to set the batch size. The batch sizes supported by the model are configured during model building, for example, through the dynamic_batch_size parameter of ATC. For details, see --dynamic_batch_size in ATC Instructions.

    If the dynamic image size feature is involved during model inference, call the AscendCL API to set the image size. The image sizes supported by the model are configured during model building, for example, through the dynamic_image_size parameter of ATC. For details, see --dynamic_image_size in ATC Instructions.

    If the dynamic dimensions (ND format only) feature is involved during model inference, call the AscendCL API to set the dimensions. The dimensions supported by the model are configured during model building, for example, through the dynamic_dims parameter of ATC. For details, see --dynamic_dims in ATC Instructions.

    After model building, the inputs of dynamic batch size, dynamic image size, and dynamic dimensions are added to the generated OM model. During model inference, the input values are provided.

    Assume that the batch size of input a is dynamic. In the generated OM model, an extra input b is added to describe the batch size of input a. To run the model, prepare the data structure of input a (see Preparing Input/Output Data Structure for Model Execution for details), prepare the data structure of input b, and set the data of input b (see 2 for details).

  2. Before model inference
    • Prepare the data structures of the inputs of dynamic batch size, dynamic image size, and dynamic dimensions.
      1. Before allocating memory for the inputs of dynamic batch size, dynamic image size, and dynamic dimensions, call aclmdlGetInputIndexByName to obtain the index of the input in the model based on the input name (the input name is fixed to ACL_DYNAMIC_TENSOR_NAME).
        ACL_DYNAMIC_TENSOR_NAME is a macro defined as follows:
        #define ACL_DYNAMIC_TENSOR_NAME "ascend_mbatch_shape_data"
      2. Pass the input index to the aclmdlGetInputSizeByIndex call to obtain the input buffer size.
      3. Pass the size obtained in 2.b to the aclrtMalloc call to allocate the buffer.

        Do not initialize this buffer manually; otherwise, the service becomes unavailable. After the API calls described in 2.b, the system automatically initializes the buffer.

      4. Call aclCreateDataBuffer to create data of the aclDataBuffer type to store the buffer address and buffer size of the inputs of dynamic batch size, image size, and dimensions.
      5. Call aclmdlCreateDataset to create data of type aclmdlDataset, and call aclmdlAddDatasetBuffer to add data of type aclDataBuffer to data of type aclmdlDataset.
    • Set the dynamic batch size, dynamic image size, and dynamic dimensions.
      Figure 1 API call sequence
      1. Call aclmdlGetInputIndexByName to obtain the index of the runtime shape or dynamic AIPP input using the input name (fixed to ACL_DYNAMIC_TENSOR_NAME).
      2. Set the dynamic batch size, dynamic image size, and dynamic dimensions.
        • Call aclmdlSetDynamicBatchSize to set the runtime batch size.

          The configured batch size must be among the batch size profiles set during model building.

          You can also call aclmdlGetDynamicBatch to obtain the batch size profiles supported by the model.

        • Call aclmdlSetDynamicHWSize to set the runtime image size.

          The configured image size must be among the image size profiles set during model building.

          You can also call the aclmdlGetDynamicHW API to obtain the image size profiles supported by the model.

        • Call aclmdlSetInputDynamicDims to set the runtime dimensions.

          The configured dimensions must be among the dimension profiles set during model building.

          You can also call aclmdlGetInputDynamicDims to obtain the dimension profiles supported by the model.
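
The preparation steps 2.a-2.e above can be sketched as follows. This is a minimal fragment in the style of the samples below, assuming modelDesc_ (aclmdlDesc *) and input_ (aclmdlDataset *) have already been created; error handling is elided, and each returned aclError should be checked in real code.

```cpp
size_t index = 0;
// 2.a Obtain the index of the dynamic shape input by its fixed name.
aclError ret = aclmdlGetInputIndexByName(modelDesc_, ACL_DYNAMIC_TENSOR_NAME, &index);
// 2.b Obtain the buffer size required by that input.
size_t bufferSize = aclmdlGetInputSizeByIndex(modelDesc_, index);
// 2.c Allocate device memory of that size. Do not initialize it manually.
void *buffer = nullptr;
ret = aclrtMalloc(&buffer, bufferSize, ACL_MEM_MALLOC_NORMAL_ONLY);
// 2.d Wrap the buffer address and size in an aclDataBuffer.
aclDataBuffer *dataBuffer = aclCreateDataBuffer(buffer, bufferSize);
// 2.e Add the aclDataBuffer to the aclmdlDataset holding the model inputs.
ret = aclmdlAddDatasetBuffer(input_, dataBuffer);
```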

Sample Code for Dynamic Batch

This section focuses on the code logic of model inference. For details about how to initialize and deinitialize AscendCL, see Initializing AscendCL. For details about how to allocate and deallocate runtime resources, see Runtime Resource Allocation and Deallocation.

When calling these APIs in your own code, add exception handling branches and print error- and information-level logs as needed. The following code snippet shows the key steps only and is not ready to be built or run.

//1. Initialize AscendCL.

//2. Allocate runtime resources.

//3. Load the model and then set the runtime batch size.
// ......

//4. Prepare the model description modelDesc_, the model inputs input_, and the model outputs output_.
// ......

//5. Customize a function to set the runtime batch size.
int ModelSetDynamicInfo()
{
    size_t index;
    //5.1 Obtain the index of the input with dynamic batch size. The input name is fixed to ACL_DYNAMIC_TENSOR_NAME.
    aclError ret = aclmdlGetInputIndexByName(modelDesc_, ACL_DYNAMIC_TENSOR_NAME, &index);
    //5.2 Set the batch size.
    //modelId_ indicates the ID of a successfully loaded model, input_ indicates data of the aclmdlDataset type, index indicates the input index of the dynamic batch input, and batchSize indicates the batch size (8 for example).
    uint64_t batchSize = 8;
    ret = aclmdlSetDynamicBatchSize(modelId_, input_, index, batchSize);
    // ......
}

//6. Customize a function to execute the model.
int ModelExecute(int index)
{
    aclError ret;
    //6.1 Call the user-defined function to set the runtime batch size.
    ret = ModelSetDynamicInfo();
    //6.2 Execute the model. modelId_ indicates the ID of a successfully loaded model, input_ indicates the model inputs, and output_ indicates the model outputs.
    ret = aclmdlExecute(modelId_, input_, output_);
    // ......
}
//7. Process the model inference result.

//8. Deallocate runtime resources.

//9. Deinitialize AscendCL.

// ......
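
The sample above sets a batch size without checking which profiles the model supports. As noted earlier, aclmdlGetDynamicBatch can query them; a hedged sketch, assuming the aclmdlBatch structure fields (batchCount and the batch array) as described in the AscendCL API reference:

```cpp
aclmdlBatch batchInfo;
aclError ret = aclmdlGetDynamicBatch(modelDesc_, &batchInfo);
for (size_t i = 0; i < batchInfo.batchCount; ++i) {
    // batchInfo.batch[i] is one batch size profile configured at build time,
    // e.g. via --dynamic_batch_size="1,2,4,8"; aclmdlSetDynamicBatchSize
    // accepts only one of these values.
    printf("supported batch size: %lu\n", (unsigned long)batchInfo.batch[i]);
}
```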

Sample Code for Dynamic Image Size

This section focuses on the code logic of model inference. For details about how to initialize and deinitialize AscendCL, see Initializing AscendCL. For details about how to allocate and deallocate runtime resources, see Runtime Resource Allocation and Deallocation.

When calling these APIs in your own code, add exception handling branches and print error- and information-level logs as needed. The following code snippet shows the key steps only and is not ready to be built or run.

41
//1. Initialize AscendCL.

//2. Allocate runtime resources.

//3. Load the model and then set the runtime image size.
// ......

//4. Prepare the model description modelDesc_, the model inputs input_, and the model outputs output_.
// ......

//5. Customize a function to set the runtime image size.
int ModelSetDynamicInfo()
{
    size_t index;
    //5.1 Obtain the index of the input with dynamic image size. The input name is fixed to ACL_DYNAMIC_TENSOR_NAME.
    aclError ret = aclmdlGetInputIndexByName(modelDesc_, ACL_DYNAMIC_TENSOR_NAME, &index);
    //5.2 Set the image size. modelId_ indicates the ID of a successfully loaded model, input_ indicates data of type aclmdlDataset, and index indicates the index of the input with dynamic image size.
    uint64_t height = 224;
    uint64_t width = 224;
    ret = aclmdlSetDynamicHWSize(modelId_, input_, index, height, width);
    // ......
}

//6. Customize a function to execute the model.
int ModelExecute(int index)
{
    aclError ret;
    //6.1 Call the user-defined function to set the runtime image size.
    ret = ModelSetDynamicInfo();
    //6.2 Execute the model. modelId_ indicates the ID of a successfully loaded model, input_ indicates the model inputs, and output_ indicates the model outputs.
    ret = aclmdlExecute(modelId_, input_, output_);
    // ......
}

//7. Process the model inference result.

//8. Deallocate runtime resources.

//9. Deinitialize AscendCL.

// ......
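
Similarly, the image size profiles supported by the model can be queried before setting one. A hedged sketch, assuming the aclmdlHW structure fields (hwCount and the hw[][2] array) as described in the AscendCL API reference; the -1 index argument is an assumption here, so consult aclmdlGetDynamicHW in the API reference for the value expected by your CANN version:

```cpp
aclmdlHW hwInfo;
aclError ret = aclmdlGetDynamicHW(modelDesc_, -1, &hwInfo);
for (size_t i = 0; i < hwInfo.hwCount; ++i) {
    // Each entry is one <height, width> profile configured at build time,
    // e.g. via --dynamic_image_size="224,224;448,448"; aclmdlSetDynamicHWSize
    // accepts only one of these pairs.
    printf("supported HW: %lu,%lu\n",
           (unsigned long)hwInfo.hw[i][0], (unsigned long)hwInfo.hw[i][1]);
}
```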

Sample Code for Dynamic Dimensions (ND Format Only)

This section focuses on the code logic of model inference. For details about how to initialize and deinitialize AscendCL, see Initializing AscendCL. For details about how to allocate and deallocate runtime resources, see Runtime Resource Allocation and Deallocation.

When calling these APIs in your own code, add exception handling branches and print error- and information-level logs as needed. The following code snippet shows the key steps only and is not ready to be built or run.

//1. Initialize AscendCL.

//2. Allocate runtime resources.

//3. Load the model and then set the runtime dimensions.
// ......

//4. Prepare the model description modelDesc_, the model inputs input_, and the model outputs output_.
// ......

//5. Customize a function to set the runtime dimensions.
int ModelSetDynamicInfo()
{
    size_t index;
    //5.1 Obtain the index of the input with dynamic dimensions. The input name is fixed to ACL_DYNAMIC_TENSOR_NAME.
    aclError ret = aclmdlGetInputIndexByName(modelDesc_, ACL_DYNAMIC_TENSOR_NAME, &index);
    //5.2 Set the runtime dimensions, including the dimension count (dimCount) and the size of each dimension. modelId_ indicates the ID of a successfully loaded model, input_ indicates data of type aclmdlDataset, and index indicates the index of the input with dynamic dimensions.
    aclmdlIODims currentDims;
    currentDims.dimCount = 4;
    currentDims.dims[0] = 8;
    currentDims.dims[1] = 3;
    currentDims.dims[2] = 224;
    currentDims.dims[3] = 224;
    ret = aclmdlSetInputDynamicDims(modelId_, input_, index, &currentDims);
    // ......
}

//6. Customize a function to execute the model.
int ModelExecute(int index)
{
    aclError ret;
    //6.1 Call the user-defined function to set the runtime dimensions.
    ret = ModelSetDynamicInfo();
    //6.2 Execute the model. modelId_ indicates the ID of a successfully loaded model, input_ indicates the model inputs, and output_ indicates the model outputs.
    ret = aclmdlExecute(modelId_, input_, output_);
    // ......
}
//7. Process the model inference result.

//8. Deallocate runtime resources.

//9. Deinitialize AscendCL.

// ......
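
The dimension profiles (gears) supported by the model can likewise be queried with aclmdlGetInputDynamicDims, as noted earlier. A hedged sketch, assuming the gear-count helper aclmdlGetInputDynamicGearCount and the gearCount-style signature of aclmdlGetInputDynamicDims from the AscendCL API reference; the -1 index argument is an assumption, so verify it against the API reference for your CANN version:

```cpp
size_t gearCount = 0;
aclError ret = aclmdlGetInputDynamicGearCount(modelDesc_, -1, &gearCount);
aclmdlIODims *gears = new aclmdlIODims[gearCount];
ret = aclmdlGetInputDynamicDims(modelDesc_, -1, gears, gearCount);
for (size_t i = 0; i < gearCount; ++i) {
    // Each gears[i] is one dimension profile configured at build time,
    // e.g. via --dynamic_dims="8,3,224,224;16,3,224,224". Pass exactly one
    // of these profiles to aclmdlSetInputDynamicDims.
}
delete[] gears;
```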