Collecting Profile Data Globally

Methods for Modifying the Training Script

Add profiling_config settings to the training script before initializing the NPU to enable and configure profile data collection.
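import npu_device as npu
# Enable profiling. Set this before the NPU is initialized.
npu.global_options().profiling_config.enable_profiling = True
# Profiling configuration options, passed as a JSON string.
npu.global_options().profiling_config.profiling_options = '{"output":"/tmp/profiling","task_trace":"on","training_trace":"on","aicpu":"on","fp_point":"","bp_point":"","aic_metrics":"PipeUtilization"}'
# Initialize the NPU and use it as the default device.
npu.open().as_default()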
In the preceding code:
  • enable_profiling: whether to enable profiling.
  • profiling_options: profiling configuration options.
    • output: path for storing profile data. Create the specified directory in the training environment (container or host) in advance. The running user configured during installation must have read and write permissions on this path. The path can be absolute or relative.
    • task_trace: whether to collect task traces.
    • training_trace: whether to collect iteration traces. If it is set to on, both fp_point and bp_point must also be configured.
    • aicpu: whether to collect details about the AI CPU operator, such as the operator execution time and data copy time.
    • fp_point: operator that marks the start of forward propagation in iteration traces. It is used to record the start timestamp of forward propagation. Leave it empty to let the system obtain the value automatically, or obtain the value manually and set it here.
    • bp_point: operator that marks the end of backward propagation in iteration traces. It is used to record the end timestamp of backward propagation. Leave it empty to let the system obtain the value automatically, or obtain the value manually and set it here.
    • aic_metrics: AI Core hardware information. The value PipeUtilization indicates the percentages of time taken by compute units and MTEs.
  • For details about profiling configuration, see Profiling.
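
Because profiling_options is a JSON string, it may be less error-prone to build it from a dictionary than to type it by hand. The following is a minimal sketch under stated assumptions, not part of the npu_device API: build_profiling_options is a hypothetical helper name, and "off" is assumed to be the accepted counterpart of "on". The helper also checks the directory requirement described for output.

```python
import json
import os


def build_profiling_options(output_dir, training_trace=True, task_trace=True,
                            aicpu=True, fp_point="", bp_point="",
                            aic_metrics="PipeUtilization"):
    """Hypothetical helper: return the JSON string for profiling_options."""
    # The output directory must already exist and be readable and writable.
    if not os.path.isdir(output_dir):
        raise FileNotFoundError(f"create the output directory first: {output_dir}")
    if not os.access(output_dir, os.R_OK | os.W_OK):
        raise PermissionError(f"need read and write access to: {output_dir}")

    def switch(enabled):
        # Assumption: "off" is the accepted counterpart of "on".
        return "on" if enabled else "off"

    options = {
        "output": output_dir,
        "task_trace": switch(task_trace),
        "training_trace": switch(training_trace),
        "aicpu": switch(aicpu),
        # Empty strings leave it to the system to obtain these values.
        "fp_point": fp_point,
        "bp_point": bp_point,
        "aic_metrics": aic_metrics,
    }
    # Compact separators produce a string without extra spaces.
    return json.dumps(options, separators=(",", ":"))


# Intended use on a machine with npu_device installed (not run here):
#   npu.global_options().profiling_config.profiling_options = (
#       build_profiling_options("/tmp/profiling"))
```

The fp_point and bp_point keys are always written, even when empty, so that both are present whenever training_trace is on.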

Using Environment Variables

In addition to collecting profile data by modifying the training script, you can modify the corresponding environment variable in the startup script to enable profile data collection.

A configuration example is provided as follows:

# Enable profiling.
export PROFILING_MODE=true 
# Configure profiling configuration options.
export PROFILING_OPTIONS='{"output":"/home/HwHiAiUser/output","training_trace":"on","task_trace":"on","aicpu":"on","fp_point":"","bp_point":"","aic_metrics":"PipeUtilization"}'

For details about how to set the PROFILING_OPTIONS environment variable, see Environment Variables.
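
If training is launched from a Python wrapper instead of a shell startup script, the same two variables can be placed in the environment of the child process. This is a sketch only: profiling_env is a hypothetical helper, and train.py in the commented usage is a placeholder for your own training script.

```python
import json
import os


def profiling_env(output_dir, base_env=None):
    """Hypothetical helper: copy an environment and add the profiling variables."""
    env = dict(os.environ if base_env is None else base_env)
    # Same values as the shell example: enable profiling, then pass the options.
    env["PROFILING_MODE"] = "true"
    env["PROFILING_OPTIONS"] = json.dumps(
        {
            "output": output_dir,
            "training_trace": "on",
            "task_trace": "on",
            "aicpu": "on",
            "fp_point": "",
            "bp_point": "",
            "aic_metrics": "PipeUtilization",
        },
        separators=(",", ":"),
    )
    return env


# Intended use (not run here; train.py is a placeholder):
#   import subprocess, sys
#   subprocess.run([sys.executable, "train.py"],
#                  env=profiling_env("/home/HwHiAiUser/output"), check=True)
```

Passing a copied environment to the child process leaves the environment of the wrapper itself unchanged.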

Note that the enable_profiling configuration item in the training script takes precedence over the PROFILING_MODE environment variable.