Overview

This section describes the command-line options used by the AOE. An option and its argument can be separated by an equal sign (=) or a space. The examples in this section use the equal sign (=).
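The two separator styles can be illustrated with a small sketch. The helper name build_aoe_command and the argument values ("1", "./model.onnx") are assumptions made for illustration only; they are not taken from the AOE documentation. Only the option names come from Table 1.

```python
from typing import Dict, List, Optional


def build_aoe_command(options: Dict[str, Optional[str]],
                      use_equals: bool = True) -> List[str]:
    """Build an argv list for the aoe executable (hypothetical helper).

    options maps an option name (for example "--job_type") to its argument.
    A value of None means the option is passed without an argument.
    """
    argv = ["aoe"]
    for name, value in options.items():
        if value is None:
            argv.append(name)                 # flag without an argument
        elif use_equals:
            argv.append(f"{name}={value}")    # --option=argument
        else:
            argv.extend([name, value])        # --option argument
    return argv


# Placeholder arguments: these values are illustrative assumptions.
example = {"--job_type": "1", "--model": "./model.onnx"}
with_equals = build_aoe_command(example)
with_spaces = build_aoe_command(example, use_equals=False)

print(" ".join(with_equals))
print(" ".join(with_spaces))
```

Both lists describe the same invocation; the list form can be handed to subprocess.run without shell quoting.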

Options that are listed by the aoe --help command but not described in Table 1 are reserved or apply to other SoC versions. You can ignore them.

Table 1 AOE command-line options

Option

Description

Mandatory (Yes/No)

Default Value

--help or -h

Displays help information.

No

N/A

--model or -m

Sets the path of the model file, including the file name.

No

N/A

--model_path

Sets the model file directory, which can store multiple model files.

No

N/A

--weight or -w

Sets the path of the weight file, including the file name.

No

N/A

--job_type or -j

Sets the tuning mode.

Yes

N/A

--framework or -f

Sets the framework of the original model.

No

N/A

--input_format

Sets the input data format.

No

NCHW (Caffe and ONNX)

NHWC (TensorFlow)

--input_shape

Sets the shape of each input.

No

N/A

--dynamic_batch_size

Sets dynamic batch size profiles. Applies to the scenario where the number of images per inference batch is not fixed.

No

N/A

--dynamic_image_size

Sets dynamic image size profiles. Applies to the scenario where the resolution of the input images for inference is not fixed.

No

N/A

--dynamic_dims

Sets dynamic dimension profiles in ND format. Applies to the scenario where the input dimensions for inference are not fixed.

No

N/A

--reload

Resumes tuning after subgraph tuning is interrupted. If the current process is interrupted and you want to continue tuning from the previous phase, specify --reload to enter the reload mode.

No

N/A

--device

Specifies the device used for tuning in the operating environment.

No

N/A

--progress_bar

Enables or disables the function of displaying the tuning progress.

No

on

--singleop

Tunes one or more specified operators by configuring the operator description file.

No

N/A

--output

Sets the path of the tuned model, including the file name.

No

N/A

--output_type

Sets the output data type of a network or an output node.

No

N/A

--host_env_os

If the OS and architecture of the model compilation environment are inconsistent with those of the model operating environment, set this option to the OS type of the model operating environment.

No

N/A

--host_env_cpu

If the OS and architecture of the model compilation environment are inconsistent with those of the model operating environment, set this option to the OS architecture of the model operating environment.

No

N/A

--aicore_num

Sets the number of AI Cores for model compilation.

No

The default value is the actual number of cores of the Ascend AI Processor.

--virtual_type

Indicates whether AOE can run on virtual devices generated on Ascend virtual instances.

Availability:

  • Atlas inference products

No

0

--out_nodes

Sets the output nodes.

No

N/A

--input_fp16_nodes

Specifies the input nodes to be treated as FP16 nodes.

No

N/A

--insert_op_conf

Sets the path of the insertion operator configuration file, including the file name.

No

N/A

--op_name_map

Sets the path of the mapping configuration file for custom (non-standard) operators, including the file name.

No

N/A

--is_input_adjust_hw_layout

Sets the data type and format of the network inputs to FP16 and NC1HWC0, respectively.

No

false

--is_output_adjust_hw_layout

Sets the data type and format of the network outputs to FP16 and NC1HWC0, respectively.

No

false

--disable_reuse_memory

Enables or disables memory reuse.

No

0

--fusion_switch_file

Sets the path of the fusion switch configuration file, including the file name.

No

N/A

--enable_scope_fusion_passes

Enables specific fusion patterns during compilation.

No

N/A

--enable_single_stream

Sets whether to enable single-stream serial execution of model inference in the static shape scenario.

A stream preserves the execution order of a sequence of asynchronous operations on the device.

No

false

--enable_small_channel

Enables small channel tuning to yield performance benefits at convolutional layers with channel size ≤ 4.

No

0

--compress_weight_conf

Sets the path of the configuration file that lists the nodes to be compressed, including the file name.

No

N/A

--compression_optimize_conf

Sets the path of the compression configuration file, including the file name. This option enables the compression optimization feature specified in the configuration file to improve network performance.

No

N/A

--buffer_optimize

Enables buffer tuning.

No

l2_optimize

--precision_mode

Sets the precision mode of a model.

No

The default arguments are as follows:

  • Inference scenario: force_fp16
  • Training scenario: allow_fp32_to_fp16
  • Atlas A2 training products/Atlas A2 inference products training scenario: must_keep_origin_dtype
  • Atlas A2 training products/Atlas A2 inference products inference scenario: force_fp16
  • Other training scenarios: allow_fp32_to_fp16
  • Other inference scenarios: force_fp16

--op_select_implmode

Selects the operator implementation mode.

No

high_performance

--optypelist_for_implmode

Lists the operator types (optypes) for the implementation mode.

No

N/A

--op_debug_level

Enables TBE operator debug during operator compilation.

No

0

--log

Sets the log level during tuning.

No

N/A

--tune_ops_file

Specifies a configuration file that identifies the operators to tune by operator name or operator type.

No

N/A

--op_precision_mode

Sets the precision mode of an operator. You can use this option to set different precision modes for different operators.

No

N/A

--modify_mixlist

Sets the operators on the mixed precision list.

No

N/A

--keep_dtype

Keeps the computation precision of some operators unchanged during the building of the original network model.

No

N/A

--customize_dtypes

Customizes the computing precision of one or more operators during model building.

No

N/A

--tune_optimization_level

Sets the tuning mode: high-performance mode or normal mode.

No

O2

--Fdeeper_opat

Sets in-depth operator tuning.

No

N/A

--Fnonhomo_split

Sets non-uniform subgraph partition tuning.

No

N/A

--Fop_format

Sets operator format tuning.

No

N/A

--sparsity

Enables global sparsity.

No

0

--op_tune_mode

Enables static kernel tuning, which generates the tuned kernel based on the input operator .json file and saves the kernel to a specified directory.

No

N/A

--op_tune_file

Specifies the path for storing the operator .json file.

No

N/A

--op_tune_kernel_path

Specifies the path for storing the static kernel.

No

N/A

--soc_version

Specifies the version of the Ascend AI Processor.

No

N/A

--init_bypass

Passes through compilation options that the AOE tuning framework and tuning services cannot detect in the model initialization phase.

No

N/A

--build_bypass

Passes through compilation options that the AOE tuning framework and tuning services cannot detect in the model compilation phase.

No

N/A
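
The Mandatory and Default Value columns of Table 1 can be used to check an option set before AOE is invoked. The following is a minimal sketch, not part of AOE: the names TABLE_1_SUBSET, ALIASES, missing_mandatory, and effective_options are hypothetical, only a few rows of the table are transcribed, and the argument values "1" and "off" are assumed placeholders.

```python
from typing import Dict, List, Optional, Tuple

# A few rows transcribed from Table 1: option -> (mandatory, default).
# A default of None stands for "N/A" in the table.
TABLE_1_SUBSET: Dict[str, Tuple[bool, Optional[str]]] = {
    "--job_type": (True, None),
    "--progress_bar": (False, "on"),
    "--buffer_optimize": (False, "l2_optimize"),
    "--op_select_implmode": (False, "high_performance"),
    "--tune_optimization_level": (False, "O2"),
}

# Short forms listed in Table 1 for options that appear in this sketch.
ALIASES = {"-j": "--job_type"}


def normalize(options: Dict[str, str]) -> Dict[str, str]:
    """Replace short option names with their long forms."""
    return {ALIASES.get(name, name): value for name, value in options.items()}


def missing_mandatory(options: Dict[str, str]) -> List[str]:
    """Return the mandatory options from the subset that were not supplied."""
    given = normalize(options)
    return sorted(name for name, (mandatory, _) in TABLE_1_SUBSET.items()
                  if mandatory and name not in given)


def effective_options(options: Dict[str, str]) -> Dict[str, str]:
    """Merge the documented defaults with the options that were supplied."""
    merged = {name: default for name, (_, default) in TABLE_1_SUBSET.items()
              if default is not None}
    merged.update(normalize(options))
    return merged


# Placeholder argument values ("1", "off") are assumptions for illustration.
print(missing_mandatory({"--progress_bar": "off"}))
print(effective_options({"-j": "1", "--progress_bar": "off"}))
```

Such a check only mirrors the table; AOE itself remains the authority on which options and arguments it accepts.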