--dynamic_batch_size

Description

Sets dynamic batch size profiles. Applies to scenarios where the number of images or sentences processed per inference batch is not fixed.

In some inference scenarios, for example, where a target recognition network is executed after object detection, the batch size of the target recognition input is dynamic because the number of detected objects is not fixed. Running inference based on the maximum batch size or image size would waste computing resources. ATC supports dynamic batch size and dynamic image size: include this option or the --dynamic_image_size option in your model conversion command to set the supported batch size or image size profiles.

In the resulting .om model, you will find a newly added input, which provides the runtime batch size for model inference. For example, if the batch size of input a is dynamic, an extra input b is created in the generated .om model to carry the runtime batch size of input a.
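The usual calling pattern is to pad the actual sample count up to the nearest configured profile before feeding the extra batch input. A minimal Python sketch of that selection step follows; the helper name is hypothetical and is not part of ATC or AscendCL:

```python
def select_profile(profiles, actual_count):
    """Pick the smallest configured batch size profile that can hold
    actual_count samples; the caller pads the remaining slots.

    Hypothetical helper for illustration only -- ATC/AscendCL do not
    ship a function with this name.
    """
    for size in sorted(profiles):
        if size >= actual_count:
            return size
    raise ValueError(
        f"{actual_count} exceeds the largest profile {max(profiles)}")

# e.g. 5 detected objects with profiles "1,2,4,8" run as a batch of 8
print(select_profile([1, 2, 4, 8], 5))  # -> 8
```

The value returned here is what the application would pass as the runtime batch size for input b at inference time.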

See Also

  • Use this option in conjunction with --input_shape. This option is mutually exclusive with --dynamic_image_size, --dynamic_dims, and --input_shape_range. Only shapes with N in the first dimension are supported; that is, set the first dimension of the shape to -1. If N is not the first dimension, use --dynamic_dims instead.
  • This option cannot be used together with --framework=1. The *.air model does not support dynamic shape profiles.

Argument

Argument: Batch size profiles, for example, "1,2,4,8".

Format: Enclose the whole argument in double quotation marks (""), and separate the batch sizes with commas (,).

Restrictions: At least two profiles must be set, and a maximum of 100 profiles are supported; that is, the number of profiles is in the range (1, 100]. The recommended value range for each profile is [1, 2048].
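The restrictions above can be expressed as a small validation sketch. This is an illustration of the documented limits only; ATC performs its own checks at conversion time, and the helper name is hypothetical:

```python
def validate_profiles(profiles):
    """Check a list of batch size profiles against the documented
    restrictions: 2 to 100 profiles, each recommended in [1, 2048].

    Hypothetical helper for illustration; not an ATC API.
    """
    if not (2 <= len(profiles) <= 100):
        raise ValueError("set at least 2 and at most 100 profiles")
    for p in profiles:
        if not (1 <= p <= 2048):
            raise ValueError(
                f"profile {p} is outside the recommended range [1, 2048]")
    return True

print(validate_profiles([1, 2, 4, 8]))  # -> True
```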

Suggestions and Benefits

  • Overly large batch sizes or too many batch size profiles can cause model conversion failures.
  • For computer vision (CV) networks, the recommended value of --dynamic_batch_size is 8 or 16. In this scenario, the network performs better than with batch size = 1.
  • For optical character recognition (OCR) or natural language processing (NLP) networks, the recommended value of --dynamic_batch_size is an integer multiple of 16.
  • If you have set very large batch sizes or many batch size profiles, you are advised to run the swapoff -a command to disable the use of swap space as memory to prevent the operating environment from slowing down.

Example

--input_shape="data:-1,3,416,416;img_info:-1,4"  --dynamic_batch_size="1,2,4,8"

The -1 in --input_shape indicates that dynamic batch size is enabled. The inputs support the following shape profiles at ATC build time:

Profile 0: data(1,3,416,416)+img_info(1,4)

Profile 1: data(2,3,416,416)+img_info(2,4)

Profile 2: data(4,3,416,416)+img_info(4,4)

Profile 3: data(8,3,416,416)+img_info(8,4)

Applicability

Atlas 200/300/500 Inference Product

Atlas Training Series Product

Dependencies and Restrictions

  • Option Usage:
    • Networks that contain dynamic-shape operators (intermediate layers of the network whose shapes are not fixed) are not supported.
    • This option allows you to run inference with dynamic batch sizes. For example, to run inference on two, four, or eight images per batch, set this option to "2,4,8". Memory is allocated based on the runtime batch size.
    • If you have set dynamic batch sizes as well as dynamic AIPP (by setting --insert_op_conf):

      In your inference code, call the aclmdlSetInputAIPP API provided by AscendCL to set dynamic AIPP parameters. Ensure that batchSize is set to the maximum batch size. For details about the APIs, see aclmdlSetInputAIPP.

    • An offline model generated with this option enabled is configured with the dynamic batch size feature. Its structure might differ from that of a model generated without this option, which can lead to different inference performance.
  • API Usage:

    If this option was used to set dynamic batch sizes during model conversion, perform the following operations before calling the model execution APIs in your inference application:

    • Use the aclmdlSetDynamicBatchSize API provided by AscendCL to set the runtime batch size.
    • If aclmdlSetDynamicBatchSize is not called, the maximum value in the batch size range is used by default during model execution.

    For details about the APIs, see aclmdlSetDynamicBatchSize.
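The selection rule above (explicit batch size if set, otherwise the maximum profile) can be sketched as follows. This is a pure-Python illustration of the documented rule, not AscendCL behavior verbatim, and the function name is hypothetical:

```python
def resolve_runtime_batch(profiles, requested=None):
    """Mimic the documented selection rule: if the application does not
    call aclmdlSetDynamicBatchSize, the maximum profile is used; an
    explicitly requested batch size must be one of the configured
    profiles.

    Hypothetical helper for illustration only.
    """
    if requested is None:
        return max(profiles)  # default when no batch size is set
    if requested not in profiles:
        raise ValueError(
            f"batch size {requested} is not a configured profile")
    return requested

print(resolve_runtime_batch([1, 2, 4, 8]))     # default -> 8
print(resolve_runtime_batch([1, 2, 4, 8], 2))  # explicit -> 2
```

Because the default is the maximum profile, applications that always process small batches should call aclmdlSetDynamicBatchSize explicitly to avoid paying for the largest batch on every execution.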