Tuning Result Viewing
This section describes how to view the tuning results after tuning is complete: the information displayed on the screen, the generated custom repository and .om model, and the operator and subgraph tuning result files.
If the following information is displayed, the tuning is complete and the performance is improved.
    <xxxx> process finished. Performance improved by xx%    // xxxx indicates the tuning task name and xx% indicates the percentage of performance improvement.
For details about the tuned custom repository and .om model, see Custom Repository and .om Model. For details about the tuning result files, see Operator Tuning Result File and Subgraph Tuning Result File.
Custom Repository and .om Model
After the tuning is complete, if the conditions for generating a custom repository are met (see Figure 2 and Figure 3), a custom repository is generated. The generated custom repository is stored as follows:
- Custom subgraph repository
If TUNE_BANK_PATH and ASCEND_CACHE_PATH are not configured, the custom repository is stored in ${HOME}/Ascend/latest/data/aoe/custom/graph/${soc_version} by default. You can run the env command to check whether they are configured.
- Custom operator repository
If TUNE_BANK_PATH and ASCEND_CACHE_PATH are not configured, the custom repository is stored in ${HOME}/Ascend/latest/data/aoe/custom/op/${soc_version} by default. You can run the env command to check whether they are configured.
- ${WORK_PATH}: tuning working directory. It is the default directory where the AOE commands are executed.
- ${model_name}: model name.
- ${timestamp}: timestamp.
- ${os}_${arch}: OS and architecture. This field is available only in the dynamic shape scenario. If the name of the .om file contains a specific OS and architecture, the model file can be used only in an operating environment with the specified OS and architecture. To use the model in operating environments with other OSs and architectures, use ATC to convert it again (with the --host_env_os and --host_env_cpu options). For details, see the ATC Instructions.
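The default custom repository locations described above can be checked with a short script instead of reading the output of env by hand. The following is a minimal sketch, not part of AOE: the function name default_custom_repo_dirs and the soc_version argument are hypothetical, and the sketch only reports which of the two variables are set. It does not try to reproduce how AOE resolves the path when they are configured.

```python
import os
from pathlib import Path

# Environment variables named in this section. When either one is set,
# the default location below no longer applies.
REPO_ENV_VARS = ("TUNE_BANK_PATH", "ASCEND_CACHE_PATH")


def default_custom_repo_dirs(soc_version, env=None):
    """Return (configured, defaults).

    configured: the variables from REPO_ENV_VARS that are set, with their values.
    defaults:   the default graph/op repository directories, or None when at
                least one variable is set (the default is then not used).
    """
    env = os.environ if env is None else env
    configured = {name: env[name] for name in REPO_ENV_VARS if env.get(name)}
    if configured:
        return configured, None
    home = Path(env.get("HOME") or Path.home())
    base = home / "Ascend" / "latest" / "data" / "aoe" / "custom"
    defaults = {
        "graph": base / "graph" / soc_version,  # custom subgraph repository
        "op": base / "op" / soc_version,        # custom operator repository
    }
    return configured, defaults


if __name__ == "__main__":
    # "<soc_version>" is a placeholder; substitute the value for your device.
    configured, defaults = default_custom_repo_dirs("<soc_version>")
    if configured:
        print("Configured variables:", configured)
    else:
        print("Default repositories:", defaults)
```

Passing a plain dictionary as env makes the check easy to try without changing the real environment.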
Operator Tuning Result File
The priority of the paths for storing the operator tuning result file is: ASCEND_WORK_PATH > default path (tuning working directory). To be specific, if ASCEND_WORK_PATH is not configured, this file is stored in the default path (tuning working directory). You can run the env command to check whether ASCEND_WORK_PATH is configured. For details about ASCEND_WORK_PATH, see the Environment Variables.
During tuning, the result file generated in real time is named aoe_result_opat_${timestamp}_${pidxxx}.json, which records the information about the tuned operators. ${timestamp} is in the format of YYYYMMDD_HHMMSSMS. The variable ${pidxxx} indicates the process ID.
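To locate result files from a script, the path priority and the file name pattern above can be combined as follows. This is a sketch under stated assumptions: the helper name find_result_files is hypothetical, the search is recursive because this section does not say whether the file sits directly in the directory or in a subdirectory, and sorting by file name is assumed to give chronological order because the name starts with a fixed-width timestamp.

```python
import os
from pathlib import Path


def find_result_files(kind, env=None, cwd=None):
    """Return result files of the given kind, oldest name first.

    kind: "opat" for operator tuning results, "sgat" for subgraph tuning results.
    env:  mapping used instead of os.environ (handy for testing).
    cwd:  tuning working directory, used when ASCEND_WORK_PATH is not set.
    """
    env = os.environ if env is None else env
    work_path = env.get("ASCEND_WORK_PATH")
    # ASCEND_WORK_PATH takes priority over the tuning working directory.
    base = Path(work_path) if work_path else Path(cwd or os.getcwd())
    # Assumption: search the directory and everything below it.
    matches = base.rglob(f"aoe_result_{kind}_*.json")
    return sorted(matches, key=lambda p: p.name)


if __name__ == "__main__":
    for path in find_result_files("opat"):
        print(path)
```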
    [
      {
        "basic": {
          "tuning_name": "Tuning task name",
          "tuning_time(s)": 1494
        }
      },
      {
        "OPAT": {
          "model_baseline_performance(ms)": 113.588725,
          "model_performance_improvement": "0.31%",
          "model_result_performance(ms)": 113.236731,
          "opat_tuning_result": "tuning successful",
          "repo_modified_operators": [
            {
              "op_name": "softmax",
              "op_type": "SoftmaxV2",
              "tune_performance": {
                "Format": {
                  "performance_after_tune(us)": 99,
                  "performance_before_tune(us)": 134,
                  "performance_improvement": "35.35%",
                  "update_mode": "add"
                }
              }
            },
            .......
            {
              "op_name": "Conv_125",
              "op_type": "Conv2D",
              "tune_performance": {
                "Schedule": {
                  "performance_after_tune(us)": 72.046,
                  "performance_before_tune(us)": 72.055,
                  "performance_improvement": "0.01%",
                  "update_mode": "add"
                }
              }
            }
          ],
          "repo_summary": {
            "repo_add_num": 19,
            "repo_hit_num": 0,
            "repo_reserved_num": 0,
            "repo_unsatisfied_num": 2,
            "repo_update_num": 0,
            "total_num": 21
          }
        }
      }
    ]
If tuning fails (opat_tuning_result is tuning failed), the file also records the op_name of each operator that failed to be tuned:
    "tuning_failed_operators": [
      "res4a_branch1"
    ]
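The operator tuning result file is a JSON array of single-key objects (basic, OPAT), so it can be read with any JSON library. The sketch below shows one way to pull out the tuning result, the per-operator timings, and the failed operators. The helper names find_segment and summarize_opat are hypothetical, and the inline sample is a trimmed copy of the example above. A real file would be loaded with json.load.

```python
import json

# Trimmed copy of the sample in this section, used only for illustration.
SAMPLE = json.loads("""
[
  {"basic": {"tuning_name": "Tuning task name", "tuning_time(s)": 1494}},
  {"OPAT": {
    "opat_tuning_result": "tuning successful",
    "repo_modified_operators": [
      {"op_name": "softmax", "op_type": "SoftmaxV2",
       "tune_performance": {"Format": {
         "performance_after_tune(us)": 99,
         "performance_before_tune(us)": 134,
         "performance_improvement": "35.35%",
         "update_mode": "add"}}}
    ]
  }}
]
""")


def find_segment(entries, name):
    """Return the body of the top-level segment called `name`, or None."""
    for entry in entries:
        if name in entry:
            return entry[name]
    return None


def summarize_opat(entries):
    """Collect the tuning result, per-operator timings, and failed operators."""
    opat = find_segment(entries, "OPAT")
    if opat is None:  # the OPAT segment is absent when no operator can be tuned
        return None
    operators = []
    for op in opat.get("repo_modified_operators", []):
        # The key under tune_performance is the tuning mode (Format, Schedule, or Impl).
        for mode, perf in op.get("tune_performance", {}).items():
            operators.append({
                "op_name": op.get("op_name"),
                "mode": mode,
                "before_us": perf.get("performance_before_tune(us)"),
                "after_us": perf.get("performance_after_tune(us)"),
                "improvement": perf.get("performance_improvement"),
            })
    return {
        "result": opat.get("opat_tuning_result"),
        "operators": operators,
        # Present only when opat_tuning_result is "tuning failed".
        "failed": opat.get("tuning_failed_operators", []),
    }


if __name__ == "__main__":
    print(summarize_opat(SAMPLE))
```

Every field is read with get because some fields are not recorded when tuning is interrupted.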
| Field Name | | | | Description |
|---|---|---|---|---|
| basic | | | | |
| | tuning_name | | | Tuning task name. |
| | tuning_time(s) | | | Tuning duration, in seconds. This field is not recorded if tuning is interrupted (for example, by a core dump or OOM). |
| OPAT | | | | NOTE: If no operator is available to be tuned, this segment does not exist. |
| | model_baseline_performance(ms) | | | Model execution time before tuning, in ms. |
| | model_performance_improvement | | | Percentage of reduced model execution time after tuning. This field is not recorded if tuning is interrupted (for example, by a core dump or OOM). |
| | model_result_performance(ms) | | | Model execution time after tuning, in ms. This field is not recorded if tuning is interrupted (for example, by a core dump or OOM). |
| | opat_tuning_result | | | Tuning result: "tuning successful" if tuning succeeds, "tuning failed" if tuning fails, or "tune tuning incomplete" if tuning is incomplete or exits abnormally. |
| | repo_modified_operators | | | Details about operators whose tiling policies are added or updated after tuning. |
| | | op_name | | Operator name. |
| | | op_type | | Operator type. There can be one or more types. Multiple types are enclosed in []. |
| | | tune_performance | | Detailed information about operator performance improvement. |
| | | Format, Schedule, or Impl | | Operator tuning mode. |
| | | | performance_after_tune(us) | Operator execution time after tuning, in μs. |
| | | | performance_before_tune(us) | Operator execution time before tuning, in μs. |
| | | | performance_improvement | Percentage of reduced operator execution time after tuning. |
| | | | update_mode | Update mode of the operator tiling policies (add in the example above). NOTE: The information from op_name to update_mode is displayed for each operator whose tiling policies are added or updated. |
| | repo_summary | | | Information about operators in each state during tuning. |
| | | repo_add_num | | Number of tiling policies that are not in the repository before tuning and are added to the repository after tuning. |
| | | repo_hit_num | | Number of tiling policies that are in the repository during tuning. |
| | | repo_reserved_num | | Number of tiling policies that are in the repository before tuning and remain unchanged after tuning. |
| | | repo_unsatisfied_num | | Number of tiling policies that are not in the repository before tuning and are not written into the repository after tuning. |
| | | repo_update_num | | Number of tiling policies that are in the repository before tuning and are updated after tuning. |
| | | total_num | | Total number of tiling policies that are tuned in the tuning task. |
| | tuning_failed_operators | | | op_name list of operators that fail to be tuned. NOTE: This field is optional. It is recorded only when the value of opat_tuning_result is "tuning failed". |
Subgraph Tuning Result File
The priority of the paths for storing the subgraph tuning result file is: ASCEND_WORK_PATH > default path (tuning working directory). To be specific, if ASCEND_WORK_PATH is not configured, this file is stored in the default path (tuning working directory). You can run the env command to check whether ASCEND_WORK_PATH is configured. For details about ASCEND_WORK_PATH, see the Environment Variables.
During tuning, the result file generated in real time is named aoe_result_sgat_${timestamp}_${pidxxx}.json, which records the information about the tuned subgraphs. ${timestamp} is in the format of YYYYMMDD_HHMMSSMS. The variable ${pidxxx} indicates the process ID.
The content format is as follows. For details about the fields, see Table 2.
    [
      {
        "basic": {
          "tuning_name": "Tuning task name",
          "tuning_time(s)": 78
        }
      },
      {
        "SGAT": {
          "model_baseline_performance(ms)": 5.600486,
          "model_performance_improvement": "55.11%",
          "model_result_performance(ms)": 3.610442,
          "repo_modified_subgraphs": {
            "add_repo_subgraphs": [
              {
                "performance_after_tune(ms)": 3.573203,
                "performance_before_tune(ms)": 5.58434,
                "performance_improvement": "56.28%",
                "repo_key": "1024942313106047484"
              }
            ],
            "update_repo_subgraphs": [
              {
                "performance_after_tune(ms)": 2.573203,
                "performance_before_tune(ms)": 4.58434,
                "performance_improvement": "78.15%",
                "repo_key": "1024942313106057586"
              }
            ]
          },
          "repo_summary": {
            "repo_add_num": 1,
            "repo_hit_num": 1,
            "repo_reserved_num": 0,
            "repo_unsatisfied_num": 0,
            "repo_update_num": 1,
            "total_num": 2
          }
        }
      }
    ]
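The subgraph tuning result file can be read in the same way as any JSON array. The sketch below lists the added and updated subgraphs together with their repo_key values. The function name list_modified_subgraphs is hypothetical, and the inline sample is a trimmed copy of the example above.

```python
import json

# Trimmed copy of the sample in this section, used only for illustration.
SAMPLE = json.loads("""
[
  {"basic": {"tuning_name": "Tuning task name", "tuning_time(s)": 78}},
  {"SGAT": {
    "repo_modified_subgraphs": {
      "add_repo_subgraphs": [
        {"performance_after_tune(ms)": 3.573203,
         "performance_before_tune(ms)": 5.58434,
         "performance_improvement": "56.28%",
         "repo_key": "1024942313106047484"}
      ],
      "update_repo_subgraphs": [
        {"performance_after_tune(ms)": 2.573203,
         "performance_before_tune(ms)": 4.58434,
         "performance_improvement": "78.15%",
         "repo_key": "1024942313106057586"}
      ]
    }
  }}
]
""")


def list_modified_subgraphs(entries):
    """Return (action, repo_key, before_ms, after_ms) for each modified subgraph."""
    sgat = next((e["SGAT"] for e in entries if "SGAT" in e), None)
    if sgat is None:  # the SGAT segment is absent when subgraph tuning fails
        return []
    modified = sgat.get("repo_modified_subgraphs", {})
    rows = []
    for action, key in (("add", "add_repo_subgraphs"),
                        ("update", "update_repo_subgraphs")):
        # Either list may be missing or empty.
        for sub in modified.get(key, []):
            rows.append((
                action,
                sub.get("repo_key"),
                sub.get("performance_before_tune(ms)"),
                sub.get("performance_after_tune(ms)"),
            ))
    return rows


if __name__ == "__main__":
    for row in list_modified_subgraphs(SAMPLE):
        print(row)
```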
| Field Name | | | | Description |
|---|---|---|---|---|
| basic | | | | |
| | tuning_name | | | Tuning task name. |
| | tuning_time(s) | | | Tuning duration, in seconds. |
| SGAT | | | | NOTE: If subgraph tuning fails, this segment does not exist. |
| | model_baseline_performance(ms) | | | Model execution time before tuning, in ms. |
| | model_performance_improvement | | | Percentage of reduced model execution time after tuning. |
| | model_result_performance(ms) | | | Model execution time after tuning, in ms. |
| | repo_modified_subgraphs | | | Details about subgraphs whose tiling policies are added or updated after tuning. |
| | | add_repo_subgraphs | | Subgraphs whose tiling policies are added after tuning. There can be zero or more subgraphs. |
| | | | performance_before_tune(ms) | Subgraph execution time before tuning, in ms. |
| | | | performance_after_tune(ms) | Subgraph execution time after tuning, in ms. |
| | | | performance_improvement | Percentage of reduced subgraph execution time after tuning. |
| | | | repo_key | Subgraph key value after tuning, which is used to query the tuning repository. |
| | | update_repo_subgraphs | | Subgraphs whose tiling policies are updated after tuning. There can be zero or more subgraphs. |
| | | | performance_before_tune(ms) | Subgraph execution time before tuning, in ms. |
| | | | performance_after_tune(ms) | Subgraph execution time after tuning, in ms. |
| | | | performance_improvement | Percentage of reduced subgraph execution time after tuning. |
| | | | repo_key | Subgraph key value after tuning, which is used to query the tuning repository. |
| | repo_summary | | | Number of subgraphs in each state during tuning. |
| | | repo_add_num | | Number of subgraphs whose tiling policies are not in the repository before tuning and are added to the repository after tuning. |
| | | repo_hit_num | | Number of subgraphs whose tiling policies are in the repository during tuning. |
| | | repo_reserved_num | | Number of subgraphs whose tiling policies are in the repository before tuning and remain unchanged after tuning. |
| | | repo_unsatisfied_num | | Number of subgraphs whose tiling policies are not in the repository before tuning and are not written into the repository after tuning. |
| | | repo_update_num | | Number of subgraphs whose tiling policies are in the repository before tuning and are updated after tuning. |
| | | total_num | | Total number of subgraphs that are tuned in the tuning task. |
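The performance_improvement and model_performance_improvement values in the two samples in this section are all consistent with one calculation: the time saved divided by the execution time after tuning, truncated to two decimal places. This is an inference from the sample values only, not a documented formula, and the function name improvement_percent is hypothetical. The sketch below reproduces the sample values under that assumption.

```python
import math


def improvement_percent(before, after):
    """Improvement relative to the time AFTER tuning, truncated to 2 decimals.

    Assumption inferred from the samples in this section, not a documented rule.
    """
    percent = (before - after) / after * 100.0
    truncated = math.floor(percent * 100) / 100  # drop digits past 2 decimals
    return f"{truncated:.2f}%"


if __name__ == "__main__":
    # (before, after, value recorded in the sample)
    samples = [
        (134, 99, "35.35%"),                # softmax, in us
        (72.055, 72.046, "0.01%"),          # Conv_125, in us
        (113.588725, 113.236731, "0.31%"),  # OPAT model, in ms
        (5.58434, 3.573203, "56.28%"),      # added subgraph, in ms
        (4.58434, 2.573203, "78.15%"),      # updated subgraph, in ms
        (5.600486, 3.610442, "55.11%"),     # SGAT model, in ms
    ]
    for before, after, recorded in samples:
        print(before, after, improvement_percent(before, after), recorded)
```

Dividing by the time before tuning instead would give about 26.12% for the softmax sample, which does not match the recorded 35.35%.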