aclgrphBuildModel Configuration Parameters
| Parameter | Description |
|---|---|
|
INPUT_FORMAT |
Input format. Arguments: one of NCHW, NHWC, or ND. Configuration example: {ge::ir_option::INPUT_FORMAT, "NHWC"}
To enable AIPP during inference, the input data must be in NHWC format. In this scenario, the data format specified by INPUT_FORMAT does not take effect.
NOTE:
This parameter applies only to the dynamic batch size, dynamic image size, and dynamic dimension scenarios. In these scenarios, INPUT_FORMAT must be consistent with the format of each Data operator. Failure to do so may result in model build failures. Applicability: |
|
INPUT_SHAPE |
Input shape. Arguments:
Configuration example:
NOTE:
Applicability: |
|
INPUT_SHAPE_RANGE |
This parameter is deprecated; avoid using it. To specify the shape range of a model's input data, use INPUT_SHAPE instead. Specifies the shape range of the input data of a model. This parameter is mutually exclusive with DYNAMIC_BATCH_SIZE, DYNAMIC_IMAGE_SIZE, and DYNAMIC_DIMS.
Applicability: |
|
OP_NAME_MAP |
Path (including the file name) of the mapping configuration file of a custom operator. Because the function of a custom operator can vary from network to network, you can specify the mapping between a custom operator and the actual operator running on the network. The path (including the file name) can contain letters, digits, underscores (_), hyphens (-), and periods (.). Configuration example: OpA:Network1OpA Applicability: |
|
DYNAMIC_BATCH_SIZE |
Dynamic batch size profile. Applies to the scenario where the number of images processed per inference batch is not fixed. This parameter must be used together with INPUT_SHAPE and is mutually exclusive with DYNAMIC_IMAGE_SIZE and DYNAMIC_DIMS. In addition, N must be the first dimension of the shape, and that dimension must be set to -1. If N is not the first dimension, use DYNAMIC_DIMS instead. Argument: batch size profiles, for example, "1,2,4,8". Format: Enclose the specified arguments in double quotation marks ("") and separate profiles with commas (,). Restrictions: The number of profiles must be in the range (1, 100], that is, at least two profiles must be set. The recommended value range for each profile is [1, 2048]. Configuration example: The -1 in INPUT_SHAPE indicates that the batch size is dynamic. {ge::ir_option::INPUT_FORMAT, "NCHW"},
{ge::ir_option::INPUT_SHAPE, "data:-1,3,416,416"},
{ge::ir_option::DYNAMIC_BATCH_SIZE, "1,2,4,8"}
For details about the examples and precautions, see Special Topics > Dynamic Batch Size. Applicability: |
|
DYNAMIC_IMAGE_SIZE |
Dynamic image size configuration. Applies to the scenario where the resolution of the images input for inference is not fixed. This parameter must be used together with INPUT_SHAPE and is mutually exclusive with DYNAMIC_BATCH_SIZE and DYNAMIC_DIMS. Argument: "imagesize1_height,imagesize1_width;imagesize2_height,imagesize2_width" Format: Enclose the whole argument in double quotation marks (""), separate profiles with semicolons (;), and separate the values within a profile with commas (,). Restrictions: The number of profiles must be in the range (1, 100], that is, at least two profiles must be set. Configuration example: The -1 values in INPUT_SHAPE indicate that the image size is dynamic. {ge::ir_option::INPUT_FORMAT, "NCHW"},
{ge::ir_option::INPUT_SHAPE, "data:8,3,-1,-1"},
{ge::ir_option::DYNAMIC_IMAGE_SIZE, "416,416;832,832"}
For details about the examples and precautions, see Special Topics > Dynamic Image Size. Applicability: |
|
DYNAMIC_DIMS |
Dynamic dimension profile in ND format. Applies to the scenario where the dimension sizes for inference are uncertain. This parameter must be used together with INPUT_SHAPE and is mutually exclusive with DYNAMIC_BATCH_SIZE and DYNAMIC_IMAGE_SIZE. Argument: formatted as "dim1,dim2,dim3;dim4,dim5,dim6;dim7,dim8,dim9" Format: Enclose all profiles in double quotation marks ("") and separate profiles with semicolons (;). The dimension values in each profile fill the -1 placeholders in INPUT_SHAPE in order, and the number of -1 placeholders must equal the number of dimension values in each profile. Restrictions: The number of profiles must be in the range (1, 100], that is, at least two profiles must be set, and a maximum of 100 profiles are supported. Three to four profiles are recommended. Configuration example: {ge::ir_option::INPUT_FORMAT, "ND"},
{ge::ir_option::INPUT_SHAPE, "data:1,-1"},
{ge::ir_option::DYNAMIC_DIMS, "4;8;16;64"}
// At model build time, the supported shapes of the Data operator are 1,4; 1,8; 1,16; 1,64.
{ge::ir_option::INPUT_FORMAT, "ND"},
{ge::ir_option::INPUT_SHAPE, "data:1,-1,-1"},
{ge::ir_option::DYNAMIC_DIMS, "1,2;3,4;5,6;7,8"}
// At model build time, the supported shapes of the Data operator are 1,1,2; 1,3,4; 1,5,6; 1,7,8.
For details about the examples and precautions, see Special Topics > Dynamic Dimension Size. Applicability: |
|
INSERT_OP_FILE |
Path of the configuration file of a preprocessing operator, such as the AIPP operator. For details about how to use this parameter, see Special Topics > AIPP. This parameter is mutually exclusive with INPUT_FP16_NODES. The configuration file path can contain only letters, digits, and underscores (_). The file name can contain letters, digits, underscores (_), and periods (.). The following is an example configuration file. aipp_op {
aipp_mode:static
input_format:YUV420SP_U8
csc_switch:true
var_reci_chn_0:0.00392157
var_reci_chn_1:0.00392157
var_reci_chn_2:0.00392157
}
Applicability: |
|
PRECISION_MODE |
Operator precision mode. This parameter cannot be used together with PRECISION_MODE_V2 in the same graph. You are advised to use PRECISION_MODE_V2. Arguments:
Default: force_fp16 Configuration example: {ge::ir_option::PRECISION_MODE, "force_fp16"}
Applicability: |
|
PRECISION_MODE_V2 |
Sets the precision mode of a model. This parameter cannot be used together with PRECISION_MODE in the same graph. You are advised to use PRECISION_MODE_V2. Arguments:
Default value: fp16 Configuration example: {ge::ir_option::PRECISION_MODE_V2, "fp16"}
Applicability: |
|
ALLOW_HF32 |
This parameter is reserved and is not supported in the current version. Enables automatic replacement of the float32 data type with the HF32 data type. In the current version, this parameter takes effect only for Conv and Matmul operators. HF32 is an Ascend-specific single-precision floating-point type used for internal operator computation. The following figure compares HF32 with other common data types. HF32 shares the same value range as float32, but its mantissa precision (11 bits) is close to that of FP16 (10 bits). Replacing float32 with HF32 greatly reduces the space occupied by the data and improves performance, at the cost of reduced mantissa precision. Arguments:
Default: FP32-to-HF32 conversion is enabled for Conv operators and disabled for Matmul operators. Restrictions:
Applicability: |
|
EXEC_DISABLE_REUSED_MEMORY |
Memory reuse enable. Arguments:
Configuration example: {ge::ir_option::EXEC_DISABLE_REUSED_MEMORY, "0"}
Applicability: |
|
OUTPUT_TYPE |
Network output data type. Arguments:
After the model compilation is complete, the preceding data types are displayed as DT_FLOAT, DT_UINT8, DT_INT8, or DT_FLOAT16 in the corresponding .om model file. Configuration example: {ge::ir_option::OUTPUT_TYPE, "FP32"}
Restrictions:
Applicability: |
|
INPUT_FP16_NODES |
(Required) Name of the input node that is of the float16 type. The format is "node_name1;node_name2". Enclose the specified nodes in double quotation marks ("") and separate the nodes with semicolons (;). This parameter is mutually exclusive with INSERT_OP_FILE. Configuration examples: {ge::ir_option::INPUT_FP16_NODES, "node_name1;node_name2"}
Applicability: |
|
LOG_LEVEL |
Log level. Arguments:
Configuration example: {ge::ir_option::LOG_LEVEL, "debug"}
Applicability: |
|
OP_COMPILER_CACHE_MODE |
Disk cache mode for operator build.
Arguments:
Default: enable Configuration example: {ge::ir_option::OP_COMPILER_CACHE_MODE, "enable"}
Restrictions:
Applicability: |
|
OP_COMPILER_CACHE_DIR |
Disk cache directory for operator build. Format: The directory can contain letters, digits, underscores (_), hyphens (-), and periods (.). Defaults to $HOME/atc_data. Configuration example: {ge::ir_option::OP_COMPILER_CACHE_MODE, "enable"}
{ge::ir_option::OP_COMPILER_CACHE_DIR, "/home/test/data/atc_data"}
Restrictions:
Applicability: |
|
DEBUG_DIR |
Directory of the debug-related process files generated during operator build, including the .o (operator binary file), .json (operator description file), and .cce files. Defaults to the current directory. Restrictions:
Configuration example: {ge::ir_option::OP_DEBUG_LEVEL, "1"}
{ge::ir_option::DEBUG_DIR, "/home/test/module/out_debug_info"}
Applicability: |
|
OP_DEBUG_LEVEL |
Operator debug enable.
NOTICE:
Configuration example: {ge::ir_option::OP_DEBUG_LEVEL, "1"}
Applicability: |
|
MDL_BANK_PATH |
Sets the directory of the custom repository generated after subgraph tuning. This parameter must be used together with BUFFER_OPTIMIZE in aclgrphBuildInitialize Configuration Parameters and takes effect only when buffer optimization is enabled (buffer optimization improves performance by temporarily storing data in the buffer). Argument: path of the custom repository generated after model tuning. Format: The value can contain letters, digits, underscores (_), hyphens (-), and periods (.). Default: $HOME/Ascend/latest/data/aoe/custom/graph/<soc_version> Configuration example: {ge::ir_option::MDL_BANK_PATH, "$HOME/custom_module_path"}
Restrictions: Priority ranked from high to low: the directory specified by MDL_BANK_PATH > the directory specified by TUNE_BANK_PATH > the default directory.
Applicability: |
|
OP_BANK_PATH |
Path of the custom repository generated after operator tuning. Format: The directory can contain letters, digits, underscores (_), hyphens (-), and periods (.). Default: ${HOME}/Ascend/latest/data/aoe/custom/op Configuration example: {ge::ir_option::OP_BANK_PATH, "$HOME/custom_tune_path"}
Restrictions: Priority ranked from high to low: the directory specified by TUNE_BANK_PATH > the directory specified by OP_BANK_PATH > the default directory of the custom repository generated after operator tuning.
Applicability: |
|
MODIFY_MIXLIST |
When mixed precision is enabled, you can use this parameter to specify the path and file name of a blocklist, trustlist, and graylist file, that is, to specify which operators allow precision reduction and which do not. Set this parameter to the path and file name of the file, which is in JSON format. To determine whether an operator is on the built-in blocklist, trustlist, or graylist, check the flag value of its precision_reduce option in the built-in tuning policy file ${INSTALL_DIR}/opp/built-in/op_impl/ai_core/tbe/config/<soc_version>/aic-<soc_version>-ops-info.json.
Configuration example:
{ge::ir_option::MODIFY_MIXLIST, "/home/test/ops_info.json"}
You can specify the operator type (or types separated by commas) in ops_info.json as follows. {
"black-list": { // Blocklist
"to-remove": [ // Move an operator from the blocklist to the graylist.
"Xlog1py"
],
"to-add": [ // Move an operator from the trustlist or graylist to the blocklist.
"Matmul",
"Cast"
]
},
"white-list": { // Trustlist
"to-remove": [ // Move an operator from the trustlist to the graylist.
"Conv2D"
],
"to-add": [ // Move an operator from the blocklist or graylist to the trustlist.
"Bias"
]
}
}
The operators in the preceding example configuration file are for reference only. Configure the file based on the actual hardware environment and the operators' built-in tuning policies. To query whether an operator is on the blocklist, trustlist, or graylist, check its precision_reduce flag, for example: "Conv2D":{
"precision_reduce":{
"flag":"true"
}
}
true: trustlist; false: blocklist; not configured: graylist. Applicability: |
|
OP_PRECISION_MODE |
Sets the precision mode of one or more specified operators during internal processing. This parameter is used to transfer the customized precision mode configuration file op_precision.ini to set different precision modes for different operators. The following precision modes can be set in the configuration file:
You can view the precision or performance modes supported by an operator in the opp/built-in/op_impl/ai_core/tbe/impl_mode/all_ops_impl_mode.ini file in the CANN software installation path. Sample: in the .ini file, set the precision mode by operator type (lower priority) or by node name (higher priority), one entry per row:
[ByOpType]
optype1=high_precision
optype2=high_performance
optype4=support_out_of_bound_index
[ByNodeName]
nodename1=high_precision
nodename2=high_performance
nodename4=support_out_of_bound_index
Restrictions:
Applicability: |
|
SHAPE_GENERALIZED_BUILD_MODE |
Sets the shape build mode during graph build. This parameter will be deprecated in later versions. Do not use it in new development.
Applicability: |
|
CUSTOMIZE_DTYPES |
Customizes the precision of specified operators during model build. The other operators in the model are built according to PRECISION_MODE or PRECISION_MODE_V2. Set it to the path (including the name) of the configuration file, for example, /home/test/customize_dtypes.cfg. Restrictions:
The structure of the configuration file is as follows:
# By operator name
Opname1::InputDtype:dtype1,dtype2,…OutputDtype:dtype1,…
Opname2::InputDtype:dtype1,dtype2,…OutputDtype:dtype1,…
# By operator type
OpType::TypeName1:InputDtype:dtype1,dtype2,…OutputDtype:dtype1,…
OpType::TypeName2:InputDtype:dtype1,dtype2,…OutputDtype:dtype1,…
Example:
# By operator name
resnet_v1_50/block1/unit_3/bottleneck_v1/Relu::InputDtype:float16,int8,OutputDtype:float16,int8
# By operator type
OpType::Relu:InputDtype:float16,int8,OutputDtype:float16,int8
NOTE:
Applicability: |
|
BUILD_INNER_MODEL |
Not supported in the current version. |
|
OP_DEBUG_CONFIG |
Enables the global memory check.
The value is the path of the .cfg configuration file. Multiple options in the configuration file are separated by commas (,).
Configuration example: {ge::ir_option::OP_DEBUG_CONFIG, "/root/test0.cfg"}
Restrictions: During operator compilation, if you want to compile only some instead of all AI Core operators, you need to add the OP_DEBUG_LIST field to the test0.cfg configuration file. By doing so, only the operators specified in the list are compiled, based on the options configured in OP_DEBUG_CONFIG. The OP_DEBUG_LIST field has the following requirements:
Configuration example: Add the following information to the configuration file (for example, test0.cfg) specified by OP_DEBUG_CONFIG: {ge::ir_option::OP_DEBUG_CONFIG, "ccec_g,oom"}
{ge::ir_option::OP_DEBUG_LIST, "GatherV2,opType::ReduceSum"}
During model compilation, the GatherV2 and ReduceSum operators are compiled based on the ccec_g and oom options.
NOTE:
Applicability: |
|
EXTERNAL_WEIGHT |
Externalizes the weights of the Const/Constant nodes on the network and converts them to FileConstant nodes when the OM model file is generated. In the offline scenario, if the model weights are large and the environment restricts the .om file size, you are advised to enable this option to save the weights separately and reduce the .om file size. Arguments:
Configuration example: {ge::ir_option::EXTERNAL_WEIGHT, "1"}
Restrictions:
Applicability: |
|
EXCLUDE_ENGINES |
Prevents the network model from using one or more acceleration engines. Use vertical bars (|) to separate multiple engines. The NPU integrates multiple hardware accelerators (also called acceleration engines), such as AiCore, AiVec, and AiCpu (sorted by priority). During graph compilation, an appropriate engine is selected for an operator based on the priority. Specifically, when an operator is supported by multiple engines, the one with a higher priority is selected. EXCLUDE_ENGINES can exclude engines for operators. For example, during a training process, to prevent the data preprocessing graph and the main training graph from preempting AiCore, you can configure this parameter to prevent the data preprocessing graph from using the AiCore engine. Arguments: AiCore: AI Core hardware acceleration engine AiVec: Vector Core hardware acceleration engine AiCpu: AI CPU hardware acceleration engine Configuration example: {ge::ir_option::EXCLUDE_ENGINES, "AiCore|AiVec"}
Applicability: |
|
DISTRIBUTED_CLUSTER_BUILD |
Applicable to the distributed compilation and partition of foundation models. Enables distributed compilation and partition of a foundation model. If this parameter is enabled, the generated offline model will be used for distributed deployment. 1: enabled; empty or other values: disabled. Example: {ge::ir_option::DISTRIBUTED_CLUSTER_BUILD, "1"}
Applicability: |
|
ENABLE_GRAPH_PARALLEL |
Applicable to the distributed compilation and partition of foundation models. Indicates whether to automatically partition the original model. 1: enabled; empty or other values: disabled. The automatic partition function can be enabled only after distributed build is enabled by DISTRIBUTED_CLUSTER_BUILD. The original model is automatically partitioned based on the requirements in the GRAPH_PARALLEL_OPTION_PATH file. Example: {ge::ir_option::ENABLE_GRAPH_PARALLEL, "1"}
Applicability: |
|
GRAPH_PARALLEL_OPTION_PATH |
Applicable to the distributed compilation and partition of foundation models. Specifies the path and name of the algorithm-based partitioning policy configuration file when the original foundation model is partitioned. The path of the partitioning strategy configuration file can be configured only after both DISTRIBUTED_CLUSTER_BUILD and ENABLE_GRAPH_PARALLEL are enabled. Example: {ge::ir_option::GRAPH_PARALLEL_OPTION_PATH, "./parallel_option.json"}
The specified configuration file must be in JSON format. The following is an example:
Argument description:
Applicability: |
|
MODEL_RELATION_CONFIG |
Applicable to the distributed compilation and partition of foundation models. Sets the path and name of the configuration file that expresses the data associations and distributed communication group relationships between multiple slice models. This parameter applies to scenarios where the original model is a slice model that contains communication operators. It takes effect only after DISTRIBUTED_CLUSTER_BUILD is enabled. Example: {ge::ir_option::MODEL_RELATION_CONFIG, "./model_relation.json"}
The configuration file must be in JSON format. The following is an example: {
"deploy_config" :[ // (Required) Mapping between the model deployment and the target deployment node.
{
"submodel_name":"submodel1.air", // File name after partition at the frontend, which must be the same as the graph name.
"deploy_device_id_list":"0:0:0" // Target device to be deployed for the model: cluster: 0 node: 0 item: 0
},
{
"submodel_name":"submodel2.air",
"deploy_device_id_list":"0:0:1"
}
],
"model_name_to_instance_id":[ // Required
{
"submodel_name":"submodel1.air", // Model ID, which is specified by users in the file. Different files correspond to different IDs.
"model_instance_id":0
},
{
"submodel_name":"submodel2.air",
"model_instance_id":1
}
],
"comm_group":[{ // Optional. If the model partitioned at the frontend contains a communication operator, this parameter indicates the communication domain information of the communication operator after the partition.
"group_name":"tp_group_name_0", // Sub-communication domain of the communication operator after model partition at the frontend.
"group_rank_list":"[0,1]" // Subrank list of the communication operator after model partition at the frontend.
}],
"rank_table":[
{
"rank_id":0, // Mapping between rank IDs and model IDs
"model_instance_id":0
},
{
"rank_id":1,
"model_instance_id":1
}
]
}
Applicability: |
|
AC_PARALLEL_ENABLE |
Whether to allow AI CPU operators and AI Core operators to run in parallel in a dynamic-shape graph. When this function is enabled, the system automatically identifies AI CPU operators in the graph that can run in parallel with AI Core operators. Operators of different engines are dispatched to different streams to run in parallel, improving resource utilization and dynamic-shape execution performance. Arguments:
Configuration example: {ge::ir_option::AC_PARALLEL_ENABLE, "1"}
Applicability: |
|
QUANT_DUMPABLE |
Collects the dump data of quantization operators. For details, see Accuracy Improvement Suggestions for Model Inference in CANN AscendCL Application Software Development Guide (C&C++). During accuracy analysis of a model quantized by AMCT, the inputs and outputs of the quantization operators may be optimized away during graph build when the model is converted to an OM offline model, preventing the dump data of those operators from being exported. For example, for two consecutive quantized convolutions, the intermediate output may be optimized into a quantized int8 output. The QUANT_DUMPABLE parameter solves this problem: when it is enabled, the inputs and outputs of quantization operators are not fused, and TransData operators are inserted to restore the original model format, so that the dump data of the quantization operators can be collected. Arguments:
Configuration example: {ge::ir_option::QUANT_DUMPABLE, "1"}
Applicability: |
|
TILING_SCHEDULE_OPTIMIZE |
Tiling offload scheduling optimization. Because the internal storage of an AI Core in the NPU cannot hold all the input and output data of an operator, the input data is tiled into parts: the first part is transferred in, computed, and transferred out, followed by the next part, and so on. This process is called tiling. A computation program, called the tiling implementation, determines the tiling parameters (such as the block size transferred each time and the total number of loops) based on operator information such as the shape. The AI Core is not efficient at the scalar computation involved in the tiling implementation, so the tiling implementation generally runs on the host CPU. However, it runs on the device when the following conditions are met:
Arguments:
Configuration example: {ge::ir_option::TILING_SCHEDULE_OPTIMIZE, "1"}
Applicability: |
|
OPTION_EXPORT_COMPILE_STAT |
Whether to generate fusion_result.json, the result file of operator fusion information (including graph fusion and UB fusion), during graph build. This parameter is reserved. The file records the fusion patterns used during graph build. In the file:
Arguments:
NOTE:
Configuration example: {ge::ir_option::OPTION_EXPORT_COMPILE_STAT, "1"}
Applicability: Atlas Training Series Product: supported. Atlas 200/300/500 Inference Product: supported |