Profile Data Collection with Environment Variables
Profile data collection with environment variables applies to training and online inference on the TensorFlow framework. Unlike the collection mode that uses the TensorFlow framework API, this mode configures the profile data collection items by setting the PROFILING_OPTIONS environment variable directly in the training or online inference script.
Prerequisites
- Training scenario:
  - Prepare a model trained on TensorFlow 1.x and a matching dataset, and port the model to the Ascend AI Processor. For details, see "Manual Porting" or "Automated Porting" in the TensorFlow 1.15 Model Porting Guide.
  - Prepare a model trained on TensorFlow 2.x and a matching dataset, and port the model to the Ascend AI Processor. For details, see "Manual Porting" in the TensorFlow 2.6.5 Model Porting Guide.
- Online inference scenario: Download a pre-trained model and prepare the online inference script.
Profile Data Collection
Add the following environment variables to the training or online inference script:
export PROFILING_MODE=true
export PROFILING_OPTIONS='{"output":"/tmp/profiling","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"PipeUtilization"}'
For details about PROFILING_OPTIONS, see Profiling Options.
If PROFILING_MODE is set to true but PROFILING_OPTIONS is not set, training_trace, task_trace, hccl, aicpu, and aic_metrics (PipeUtilization) are collected by default, and the collected data is saved to the directory where the current AI job is located. If PROFILING_MODE is set to true and at least one option in PROFILING_OPTIONS is set, the remaining options take the default values described in Profiling Options.
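When the job is launched from a Python script, the same two variables can be built programmatically, which avoids hand-quoting the JSON string. The following is a minimal sketch, not the documented method: the helper name `profiling_env` is hypothetical, the option values are copied from the export example above, and it assumes the variables are set before the framework initializes the device.

```python
import json
import os

# Option values copied from the export example in this section.
profiling_options = {
    "output": "/tmp/profiling",
    "training_trace": "on",
    "task_trace": "on",
    "fp_point": "",
    "bp_point": "",
    "aic_metrics": "PipeUtilization",
}


def profiling_env(options):
    """Return PROFILING_MODE and PROFILING_OPTIONS as a dict of strings.

    Hypothetical helper: it only serializes the options to compact JSON,
    it does not validate option names or values.
    """
    return {
        "PROFILING_MODE": "true",
        "PROFILING_OPTIONS": json.dumps(options, separators=(",", ":")),
    }


env = profiling_env(profiling_options)

# Assumption: this runs before the training or inference session starts,
# so that the profiler sees the variables.
os.environ.update(env)

print(env["PROFILING_OPTIONS"])
```

Serializing with `json.dumps` keeps the value valid JSON even when an option such as `output` contains characters that would need escaping in a hand-written string.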
Data Collection Description
After PROFILING_OPTIONS is set and the training or online inference job has run, parse the raw data and export the result files as visualized profile data files. These files are saved in the PROF_XXX/mindstudio_profiler_output directory. For details, see Profile Data Parsing and Export (msprof Command).
The generated profile data is shown in Table 1.
| Argument | Profile Data File |
|---|---|
| Automatically generated by default | |
| task_trace, task_time | The CANN level in msprof_*.json and the api_statistic_*.csv file; The Ascend Hardware level in msprof_*.json and the task_time_*.csv file; The HCCL level in msprof_*.json and the hccl_statistic_*.csv file |
| runtime_api | The CANN_Runtime level in msprof_*.json and the api_statistic_*.csv file |
| hccl | The HCCL level in msprof_*.json and the hccl_statistic_*.csv file |
| aicpu | |
| aic_metrics | |
| l2 | |
| msproftx | |
| sys_hardware_mem_freq | On-chip memory read/write rate file; The LLC level in msprof_*.json and the llc_read_write_*.csv file; The NPU MEM level in msprof_*.json and the npu_mem_*.csv file |
| llc_profiling | - |
| sys_io_sampling_freq | |
| sys_interconnection_freq | |
| dvpp_freq | |
| host_sys | The CPU Usage level in msprof_*.json and the host_cpu_usage_*.csv file; The Memory Usage level in msprof_*.json and the host_mem_usage_*.csv file |
| host_sys_usage | CPU usage of processes on the host |
| host_sys_usage_freq | - |
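The mapping in Table 1 can be used to check whether an exported result directory contains the files that the enabled options should produce. The sketch below is illustrative only: the function names are hypothetical, the file patterns are copied from the table rows that name a file, `PROF_*` is an assumed glob for the PROF_XXX directory, and an option is assumed to be enabled when it is present with a value other than "off" or an empty string.

```python
import glob
import json
import os

# File patterns copied from Table 1. Rows whose file cell is empty or "-"
# are left out because the table does not name a file for them.
EXPECTED_FILES = {
    "task_trace": ["msprof_*.json", "api_statistic_*.csv",
                   "task_time_*.csv", "hccl_statistic_*.csv"],
    "runtime_api": ["msprof_*.json", "api_statistic_*.csv"],
    "hccl": ["msprof_*.json", "hccl_statistic_*.csv"],
    "sys_hardware_mem_freq": ["msprof_*.json", "llc_read_write_*.csv",
                              "npu_mem_*.csv"],
    "host_sys": ["msprof_*.json", "host_cpu_usage_*.csv",
                 "host_mem_usage_*.csv"],
}


def expected_patterns(options_json):
    """Return the de-duplicated file patterns for the enabled options."""
    options = json.loads(options_json)
    patterns = []
    for name, files in EXPECTED_FILES.items():
        # Assumption: present and neither "off" nor "" means enabled.
        if options.get(name) in (None, "off", ""):
            continue
        for pattern in files:
            if pattern not in patterns:
                patterns.append(pattern)
    return patterns


def missing_files(output_root, patterns):
    """Return the patterns with no match under
    <output_root>/PROF_*/mindstudio_profiler_output."""
    missing = []
    for pattern in patterns:
        path = os.path.join(glob.escape(output_root), "PROF_*",
                            "mindstudio_profiler_output", pattern)
        if not glob.glob(path):
            missing.append(pattern)
    return missing


patterns = expected_patterns('{"output":"/tmp/profiling","task_trace":"on"}')
print(missing_files("/tmp/profiling", patterns))
```

An empty list from `missing_files` means every expected pattern matched at least one file. A non-empty list may only indicate that the raw data has not yet been parsed and exported.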