Profiling with Environment Variables

Profiling with environment variables applies to training and online inference with the TensorFlow framework. Unlike the profiling mode that uses the TensorFlow framework API, this mode configures the profiling items by setting the PROFILING_OPTIONS environment variable directly in the training or online inference script.

Prerequisites

  • Training scenario:
  • Online inference scenario: Download a pre-trained model and prepare the online inference script.

Procedure

Set the profiling environment variables in the training or online inference script. The following is an example.
export PROFILING_MODE=true
export PROFILING_OPTIONS='{"output":"/tmp/profiling","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"PipeUtilization"}'

For details about PROFILING_OPTIONS, see Profiling Options.
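
Because PROFILING_OPTIONS is a JSON string, a quoting mistake is easy to make. The following is a minimal sketch of building the value from a variable and checking that it is well-formed JSON before the task starts. The PROF_OUTPUT_DIR variable, the up-front mkdir, and the use of python3 as a generic JSON validator are assumptions for illustration and are not part of the profiling tool.

```shell
# Hypothetical helper variable: the directory passed as the "output" option.
PROF_OUTPUT_DIR="${PROF_OUTPUT_DIR:-/tmp/profiling}"

# Assumption: create the directory in advance so that a missing path
# cannot be the cause of a failed run.
mkdir -p "$PROF_OUTPUT_DIR"

export PROFILING_MODE=true

# Build the JSON string with printf so that only the output path varies.
PROFILING_OPTIONS="$(printf '{"output":"%s","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"PipeUtilization"}' "$PROF_OUTPUT_DIR")"
export PROFILING_OPTIONS

# Optional sanity check: json.tool exits with a nonzero status on malformed JSON.
if command -v python3 >/dev/null 2>&1; then
  if printf '%s' "$PROFILING_OPTIONS" | python3 -m json.tool >/dev/null; then
    echo "PROFILING_OPTIONS is valid JSON"
  else
    echo "PROFILING_OPTIONS is not valid JSON" >&2
  fi
fi

# Start the training or online inference task after this point.
```

The check only confirms JSON syntax. It does not confirm that the option names or values are accepted by the profiler.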

If PROFILING_MODE is set to true but PROFILING_OPTIONS is not set, the training_trace, task_trace, hccl, aicpu, and aic_metrics (PipeUtilization) items are collected by default, and the profile data is saved to the directory where the current AI task is located. If PROFILING_MODE is set to true and PROFILING_OPTIONS is also set, see Profiling Options for the default value of each option.
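
The two cases above can be checked from the launch script before the task starts. The following is a small sketch, not part of the profiling tool. The function name describe_profiling_config is hypothetical, and treating any PROFILING_MODE value other than true as "disabled" is an assumption, because only the true case is described here.

```shell
# Print which profiling configuration the current environment selects.
describe_profiling_config() {
  if [ "${PROFILING_MODE:-}" != "true" ]; then
    # Assumption: profiling is off unless PROFILING_MODE is exactly "true".
    echo "disabled"
  elif [ -z "${PROFILING_OPTIONS+set}" ]; then
    # PROFILING_OPTIONS is unset: the default items are collected and the
    # data is saved to the directory where the AI task is located.
    echo "defaults"
  else
    # PROFILING_OPTIONS is set: its values apply.
    echo "custom"
  fi
}

# Example: enable profiling with the default items only.
export PROFILING_MODE=true
unset PROFILING_OPTIONS
describe_profiling_config
```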

Profiling Results

After profiling with PROFILING_OPTIONS set, parse the raw data and export the results as visualized profile data files. The exported files are saved in the PROF_XXX/mindstudio_profiler_output directory. For details, see Offline Parsing.
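
To see which profiling runs have already been parsed, a script can look for the mindstudio_profiler_output directory under each PROF_XXX directory. The following sketch assumes that the PROF_XXX directories are created directly under the configured output path and that their names start with PROF_. The function name list_profiler_outputs and the PROF_OUTPUT_DIR variable are hypothetical.

```shell
# Hypothetical: the same directory as the "output" option of PROFILING_OPTIONS.
PROF_OUTPUT_DIR="${PROF_OUTPUT_DIR:-/tmp/profiling}"

# For each PROF_* directory under the given root, report whether the
# mindstudio_profiler_output directory exists.
list_profiler_outputs() {
  root="$1"
  for prof_dir in "$root"/PROF_*; do
    # Skip the unexpanded pattern when nothing matches.
    [ -d "$prof_dir" ] || continue
    out_dir="$prof_dir/mindstudio_profiler_output"
    if [ -d "$out_dir" ]; then
      echo "parsed: $out_dir"
    else
      echo "not parsed yet: $prof_dir"
    fi
  done
}

list_profiler_outputs "$PROF_OUTPUT_DIR"
```

A directory reported as "not parsed yet" still needs the parsing and export step described in Offline Parsing.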

Table 1 shows the result files.

Table 1 Profiling result files

Argument | Result File
Automatically generated by default | msprof (Timeline Report); op_summary_*.csv; op_statistic_*.csv; fusion_op_*.csv; step_trace (iteration trace data)
task_trace, task_time | The CANN layer in msprof_*.json and the api_statistic_*.csv file; The Ascend Hardware layer in msprof_*.json and the task_time_*.csv file; The Communication layer in msprof_*.json and the communication_statistic_*.csv file; step_trace_*.json
runtime_api | The CANN_Runtime layer in msprof_*.json and the api_statistic_*.csv file
hccl | The Communication layer in msprof_*.json and the communication_statistic_*.csv file; api_statistic_*.csv
aicpu | aicpu_*.csv; dp_*.csv
aic_metrics | op_summary_*.csv
l2 | l2_cache_*.csv
msproftx | msproftx data
sys_hardware_mem_freq | On-chip memory read/write rate file; The LLC layer in msprof_*.json and the llc_read_write_*.csv file; The acc_pmu layer in msprof_*.json; The Stars Soc Info layer in msprof_*.json; The NPU MEM layer in msprof_*.json and the npu_mem_*.csv file; npu_module_mem_*.csv
llc_profiling | -
sys_io_sampling_freq | The NIC layer in msprof_*.json and the nic_*.csv file; The RoCE layer in msprof_*.json and the roce_*.csv file
sys_interconnection_freq | The PCIe layer in msprof_*.json and the pcie_*.csv file; The HCCS layer in msprof_*.json and the hccs_*.csv file; The Stars Chip Trans layer in msprof_*.json
dvpp_freq | dvpp_*.csv
instr_profiling_freq | biu_group, aic_core_group, and aiv_core_group levels in msprof_*.json
host_sys | The CPU Usage layer in msprof_*.json and the host_cpu_usage_*.csv file; The Memory Usage layer in msprof_*.json and the host_mem_usage_*.csv
host_sys_usage | System CPU usage on the host; CPU usage of processes on the host; System memory usage on the host; Memory usage of processes on the host
host_sys_usage_freq | -
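
As a quick check against Table 1, a script can count the exported files that match a few of the listed name patterns. The following is a sketch only. The function name count_result_files is hypothetical, the four patterns are a sample taken from Table 1, and the search is recursive because the exact layout inside the output directory is not described here.

```shell
# Count the files under a directory that match a few patterns from Table 1.
count_result_files() {
  out_dir="$1"
  for pattern in 'msprof_*.json' 'op_summary_*.csv' 'op_statistic_*.csv' 'step_trace_*.json'; do
    # find prints one line per matching file; wc -l counts the lines.
    count="$(find "$out_dir" -type f -name "$pattern" 2>/dev/null | wc -l | tr -d ' ')"
    echo "$pattern: $count"
  done
}

# Example call (replace PROF_XXX with an actual directory name):
# count_result_files /tmp/profiling/PROF_XXX/mindstudio_profiler_output
```

A count of 0 for a pattern can mean that the corresponding argument was not enabled or that the data has not been parsed and exported yet.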