Tuning Result Viewing

This section describes how to view the tuning results after tuning completes: the information displayed on the screen, the generated custom repository and .om model, the operator tuning result file, and the subgraph tuning result file.

If the following information is displayed, the tuning is complete and the performance is improved.

<xxxx> process finished. Performance improved by xx%

In this message, xxxx is the tuning task name and xx% is the percentage of performance improvement.

For details about the tuned custom repository and .om model, see Custom Repository and .om Model. For details about the tuning result files, see Operator Tuning Result File and Subgraph Tuning Result File.

Custom Repository and .om Model

After the tuning is complete, if the conditions for generating a custom repository are met (see Figure 2 and Figure 3), a custom repository is generated. The generated custom repository is stored as follows:

The storage path for the custom repository is chosen in the following priority order: TUNE_BANK_PATH > ASCEND_CACHE_PATH > default path. For details about TUNE_BANK_PATH and ASCEND_CACHE_PATH, see Environment Variables.
  • Custom subgraph repository

    If neither TUNE_BANK_PATH nor ASCEND_CACHE_PATH is configured, the custom subgraph repository is stored in ${HOME}/Ascend/latest/data/aoe/custom/graph/${soc_version} by default. You can run the env command to check whether these variables are configured.

  • Custom operator repository

    If neither TUNE_BANK_PATH nor ASCEND_CACHE_PATH is configured, the custom operator repository is stored in ${HOME}/Ascend/latest/data/aoe/custom/op/${soc_version} by default. You can run the env command to check whether these variables are configured.
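The priority order above can be sketched in Python as follows. This is a minimal illustration, not part of AOE: the function name resolve_custom_repo is hypothetical, and because this section does not describe the directory layout used under TUNE_BANK_PATH or ASCEND_CACHE_PATH, the sketch returns only the configured base directory in those cases.

```python
import os
from pathlib import Path


def resolve_custom_repo(kind, soc_version, env=None):
    """Return (source, path) for the custom repository location.

    kind is "graph" (subgraph repository) or "op" (operator repository).
    Hypothetical helper; it only mirrors the priority order in the text.
    """
    if kind not in ("graph", "op"):
        raise ValueError("kind must be 'graph' or 'op'")
    env = os.environ if env is None else env
    # Priority stated in the text: TUNE_BANK_PATH > ASCEND_CACHE_PATH > default.
    for name in ("TUNE_BANK_PATH", "ASCEND_CACHE_PATH"):
        value = env.get(name)
        if value:
            # The layout below these directories is not described in this
            # section, so only the configured base directory is returned.
            return name, Path(value)
    home = Path(env.get("HOME", str(Path.home())))
    default = home / "Ascend" / "latest" / "data" / "aoe" / "custom" / kind / soc_version
    return "default", default
```

Passing a plain dictionary as env makes it possible to try the different cases without changing the real environment.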

The storage path for the tuned .om model is chosen in the following priority order: ASCEND_WORK_PATH > default path. If ASCEND_WORK_PATH is not configured, the tuned .om model is stored by default as ${WORK_PATH}/aoe_workspace/${model_name}_${timestamp}/tunespace/result/${model_name}_${timestamp}_tune.om (or ${model_name}_${timestamp}_tune_${os}_${arch}.om). You can run the env command to check whether ASCEND_WORK_PATH is configured. For details about ASCEND_WORK_PATH, see Environment Variables. The fields in the path are described as follows:
  • ${WORK_PATH}: tuning working directory. By default, it is the directory where the AOE command is executed.
  • ${model_name}: model name.
  • ${timestamp}: timestamp.
  • ${os}_${arch}: OS and architecture. This field is available only in the dynamic shape scenario. If the name of the .om file contains a specific OS and architecture, the model file can be used only in an operating environment with the specified OS and architecture. To use the model in operating environments with other OSs and architectures, use ATC to convert it again (with the --host_env_os and --host_env_cpu options). For details, see the ATC Instructions.
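To locate the tuned model files described above, a short Python sketch such as the following may help. It assumes ASCEND_WORK_PATH is not configured, so the default layout applies; the function name find_tuned_models is hypothetical, and the glob pattern is deliberately loose so that it matches both the plain _tune.om name and the variant with the OS and architecture suffix.

```python
from pathlib import Path


def find_tuned_models(work_path):
    """List tuned .om files under <work_path>/aoe_workspace, newest first.

    Assumes the default layout:
    aoe_workspace/<model>_<timestamp>/tunespace/result/<...>_tune[...].om
    """
    root = Path(work_path) / "aoe_workspace"
    if not root.is_dir():
        return []
    # Matches <name>_tune.om as well as <name>_tune_<os>_<arch>.om.
    hits = root.glob("*/tunespace/result/*_tune*.om")
    return sorted(hits, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, find_tuned_models(".") searches below the current directory, which matches the default when the AOE command was run there.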

Operator Tuning Result File

The storage path for the operator tuning result file is chosen in the following priority order: ASCEND_WORK_PATH > default path (tuning working directory). That is, if ASCEND_WORK_PATH is not configured, the file is stored in the tuning working directory. You can run the env command to check whether ASCEND_WORK_PATH is configured. For details about ASCEND_WORK_PATH, see Environment Variables.

During tuning, the result file generated in real time is named aoe_result_opat_${timestamp}_${pidxxx}.json, which records the information about the tuned operators. ${timestamp} is in the format of YYYYMMDD_HHMMSSMS. The variable ${pidxxx} indicates the process ID.
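The naming scheme can be matched with a regular expression, as in the following sketch. Two points are assumptions rather than documented facts: the number of digits after the date part of ${timestamp} is not fixed here (the pattern accepts any run of digits), and the exact form of the ${pidxxx} part is not specified, so the pattern accepts any trailing text. The same pattern also covers the subgraph result file, whose name uses sgat instead of opat.

```python
import re

# Hypothetical pattern derived from the naming scheme in the text:
# aoe_result_<opat|sgat>_<YYYYMMDD>_<time digits>_<process part>.json
RESULT_NAME = re.compile(r"^aoe_result_(opat|sgat)_(\d{8}_\d+)_(.+)\.json$")


def parse_result_name(filename):
    """Split a result file name into kind, timestamp, and process part.

    Returns None when the name does not follow the scheme.
    """
    match = RESULT_NAME.match(filename)
    if match is None:
        return None
    kind, timestamp, pid_part = match.groups()
    return {"kind": kind, "timestamp": timestamp, "pid": pid_part}
```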

The content format is as follows. For details about the fields, see Table 1.
[
  {
    "basic": {
      "tuning_name": "Tuning task name",
      "tuning_time(s)": 1494
    }
  },
  {
    "OPAT": {
      "model_baseline_performance(ms)": 113.588725,
      "model_performance_improvement": "0.31%",
      "model_result_performance(ms)": 113.236731,
      "opat_tuning_result": "tuning successful",
      "repo_modified_operators": [
        {
          "op_name": "softmax",
          "op_type": "SoftmaxV2",
          "tune_performance": {
            "Format": {
              "performance_after_tune(us)": 99,
              "performance_before_tune(us)": 134,
              "performance_improvement": "35.35%",
              "update_mode": "add"
            }
          }
        },
       .......
        {
          "op_name": "Conv_125",
          "op_type": "Conv2D",
          "tune_performance": {
            "Schedule": {
              "performance_after_tune(us)": 72.046,
              "performance_before_tune(us)": 72.055,
              "performance_improvement": "0.01%",
              "update_mode": "add"
            }
          }
        }
      ],
      "repo_summary": {
        "repo_add_num": 19,
        "repo_hit_num": 0,
        "repo_reserved_num": 0,
        "repo_unsatisfied_num": 2,
        "repo_update_num": 0,
        "total_num": 21
      }
    }
  }
]
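A result file in this format can be summarized with a few lines of Python. The sketch below is an illustration only: the helper names load_segments and summarize_opat are hypothetical, and the code relies solely on the structure shown in the sample (a top-level list of single-key objects). It treats the OPAT segment and the tuning_failed_operators field as optional.

```python
import json


def load_segments(text):
    """Merge the top-level list of single-key objects into one dict."""
    merged = {}
    for segment in json.loads(text):
        merged.update(segment)
    return merged


def summarize_opat(text):
    """Return (result, rows, failed) from operator tuning result JSON text.

    rows holds (op_name, tuning mode, improvement, update_mode) tuples.
    """
    opat = load_segments(text).get("OPAT")
    if opat is None:
        # The segment is absent when no operator is available to be tuned.
        return None, [], []
    rows = []
    for op in opat.get("repo_modified_operators", []):
        for mode, perf in op.get("tune_performance", {}).items():
            rows.append(
                (op["op_name"], mode, perf["performance_improvement"], perf["update_mode"])
            )
    failed = opat.get("tuning_failed_operators", [])
    return opat.get("opat_tuning_result"), rows, failed
```

The text passed in is the content of the result file, for example the output of Path(result_file).read_text().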

If tuning fails (the value of opat_tuning_result is tuning failed), the file also lists the op_name of each operator that failed to be tuned:

      "tuning_failed_operators": [
        "res4a_branch1"
      ]
Table 1 Fields (field name: description; indentation shows nesting)

  • basic
    • tuning_name: Tuning task name.
    • tuning_time(s): Tuning duration, in seconds. This field is not recorded in tuning interruption scenarios (such as coredump and OOM).
  • OPAT
    NOTE: If no operator is available to be tuned, this segment does not exist.
    • model_baseline_performance(ms): Model execution time before tuning, in ms.
    • model_performance_improvement: Percentage of reduced model execution time after tuning. This field is not recorded in tuning interruption scenarios (such as coredump and OOM).
    • model_result_performance(ms): Model execution time after tuning, in ms. This field is not recorded in tuning interruption scenarios (such as coredump and OOM).
    • opat_tuning_result: Tuning result. The value is "tuning successful" when tuning succeeds, "tuning failed" when tuning fails, or "tune tuning incomplete" when tuning is incomplete or exits abnormally.
    • repo_modified_operators: Details about operators whose tiling policies are added or updated after tuning.
      NOTE: The information from op_name to update_mode is recorded for each operator whose tiling policies are added or updated.
      • op_name: Operator name.
      • op_type: Operator type. There can be one or more types. Multiple types are enclosed in [].
      • tune_performance: Detailed information about operator performance improvement.
        • Format, Schedule, or Impl: Operator tuning mode. Format is present only when Format is enabled during operator tuning and the performance is improved through Format tuning. Schedule is present only when the performance is improved through Schedule tuning. Impl is present only when the performance is improved through Impl tuning.
          • performance_after_tune(us): Operator execution time after tuning, in μs.
          • performance_before_tune(us): Operator execution time before tuning, in μs.
          • performance_improvement: Percentage of reduced operator execution time after tuning.
          • update_mode: Update mode of the operator tiling policies. The value is add (operator tiling policies are added) or update (operator tiling policies are updated).
    • repo_summary: Information about operators in each state during tuning.
      • repo_add_num: Number of tiling policies that are not in the repository before tuning and are added to the repository after tuning.
      • repo_hit_num: Number of tiling policies that are in the repository during tuning.
      • repo_reserved_num: Number of tiling policies that are in the repository before tuning and remain unchanged after tuning.
      • repo_unsatisfied_num: Number of tiling policies that are not in the repository before tuning and are not written into the repository after tuning.
      • repo_update_num: Number of tiling policies that are in the repository before tuning and are updated after tuning.
      • total_num: Total number of tiling policies that are tuned in the tuning task. The following relations hold:
        • repo_hit_num = repo_update_num + repo_reserved_num
        • total_num = repo_add_num + repo_hit_num + repo_unsatisfied_num
    • tuning_failed_operators: op_name list of operators that fail to be tuned.
      NOTE: This field is optional. It is recorded only when the value of opat_tuning_result is tuning failed.

Subgraph Tuning Result File

The storage path for the subgraph tuning result file is chosen in the following priority order: ASCEND_WORK_PATH > default path (tuning working directory). That is, if ASCEND_WORK_PATH is not configured, the file is stored in the tuning working directory. You can run the env command to check whether ASCEND_WORK_PATH is configured. For details about ASCEND_WORK_PATH, see Environment Variables.

During tuning, the result file generated in real time is named aoe_result_sgat_${timestamp}_${pidxxx}.json, which records the information about the tuned subgraphs. ${timestamp} is in the format of YYYYMMDD_HHMMSSMS. The variable ${pidxxx} indicates the process ID.

The content format is as follows. For details about the fields, see Table 2.

[
  {
    "basic": {
      "tuning_name": "Tuning task name",
      "tuning_time(s)": 78
    }
  },
  {
    "SGAT": {
      "model_baseline_performance(ms)": 5.600486,
      "model_performance_improvement": "55.11%",
      "model_result_performance(ms)": 3.610442,
      "repo_modified_subgraphs": {
        "add_repo_subgraphs": [
          {
            "performance_after_tune(ms)": 3.573203,
            "performance_before_tune(ms)": 5.58434,
            "performance_improvement": "56.28%",
            "repo_key": "1024942313106047484"
          }
        ],
        "update_repo_subgraphs": [
          {
            "performance_after_tune(ms)": 2.573203,
            "performance_before_tune(ms)": 4.58434,
            "performance_improvement": "78.15%",
            "repo_key": "1024942313106057586"
          }
        ]
      },
      "repo_summary": {
        "repo_add_num": 1,
        "repo_hit_num": 1,
        "repo_reserved_num": 0,
        "repo_unsatisfied_num": 0,
        "repo_update_num": 1,
        "total_num": 2
      }
    }
  }
]
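The field descriptions give the improvement values as a percentage of reduced execution time but do not state a formula. The sample values in both result files are consistent with (before - after) / after, apparently truncated to two decimal places. That reading is an inference from the samples in this section, not documented behavior, and the function name improvement_percent is hypothetical.

```python
def improvement_percent(before, after):
    """Improvement relative to the tuned time, as a percentage.

    Inferred from the sample values in this section; not a documented formula.
    """
    return (before - after) / after * 100.0


# (before, after, reported) triples copied from the sample result files.
SAMPLES = [
    (134, 99, "35.35%"),
    (113.588725, 113.236731, "0.31%"),
    (5.600486, 3.610442, "55.11%"),
    (5.58434, 3.573203, "56.28%"),
    (4.58434, 2.573203, "78.15%"),
]

for before, after, reported in SAMPLES:
    computed = improvement_percent(before, after)
    print(f"before={before} after={after} computed={computed:.4f}% reported={reported}")
```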
Table 2 Fields (field name: description; indentation shows nesting)

  • basic
    • tuning_name: Tuning task name.
    • tuning_time(s): Tuning duration, in seconds.
  • SGAT
    NOTE: If subgraph tuning fails, this segment does not exist.
    • model_baseline_performance(ms): Model execution time before tuning, in ms.
    • model_performance_improvement: Percentage of reduced model execution time after tuning.
    • model_result_performance(ms): Model execution time after tuning, in ms.
    • repo_modified_subgraphs: Details about subgraphs whose tiling policies are added or updated after tuning.
      • add_repo_subgraphs: Subgraphs whose tiling policies are added after tuning. There can be zero or more subgraphs.
        • performance_before_tune(ms): Subgraph execution time before tuning, in ms.
        • performance_after_tune(ms): Subgraph execution time after tuning, in ms.
        • performance_improvement: Percentage of reduced subgraph execution time after tuning.
        • repo_key: Subgraph key value after tuning, which is used to query the tuning repository.
      • update_repo_subgraphs: Subgraphs whose tiling policies are updated after tuning. There can be zero or more subgraphs.
        • performance_before_tune(ms): Subgraph execution time before tuning, in ms.
        • performance_after_tune(ms): Subgraph execution time after tuning, in ms.
        • performance_improvement: Percentage of reduced subgraph execution time after tuning.
        • repo_key: Subgraph key value after tuning, which is used to query the tuning repository.
    • repo_summary: Number of subgraphs in each state during tuning.
      • repo_add_num: Number of subgraphs whose tiling policies are not in the repository before tuning and are added to the repository after tuning.
      • repo_hit_num: Number of subgraphs whose tiling policies are in the repository during tuning.
      • repo_reserved_num: Number of subgraphs whose tiling policies are in the repository before tuning and remain unchanged after tuning.
      • repo_unsatisfied_num: Number of subgraphs whose tiling policies are not in the repository before tuning and are not written into the repository after tuning.
      • repo_update_num: Number of subgraphs whose tiling policies are in the repository before tuning and are updated after tuning.
      • total_num: Total number of subgraphs that are tuned in the tuning task. The following relations hold:
        • repo_hit_num = repo_update_num + repo_reserved_num
        • total_num = repo_add_num + repo_hit_num + repo_unsatisfied_num
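The two relations given for total_num apply to the repo_summary object in both result files and can be checked mechanically. The following sketch is one way to do so; the function name check_repo_summary is hypothetical, and the input is assumed to be the repo_summary dictionary taken from a parsed result file.

```python
def check_repo_summary(summary):
    """Return the list of violated relations (empty when consistent)."""
    problems = []
    hit = summary["repo_hit_num"]
    # repo_hit_num = repo_update_num + repo_reserved_num
    if hit != summary["repo_update_num"] + summary["repo_reserved_num"]:
        problems.append("repo_hit_num != repo_update_num + repo_reserved_num")
    # total_num = repo_add_num + repo_hit_num + repo_unsatisfied_num
    expected_total = summary["repo_add_num"] + hit + summary["repo_unsatisfied_num"]
    if summary["total_num"] != expected_total:
        problems.append("total_num != repo_add_num + repo_hit_num + repo_unsatisfied_num")
    return problems
```

Both sample summaries in this section satisfy the relations: 19 + 0 + 2 = 21 for the operator sample and 1 + 1 + 0 = 2 for the subgraph sample.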