Collecting and Flushing Profile Data

Call the profiling APIs to collect profile data automatically. After the raw profile data has been collected, copy it to a development environment where the CANN Toolkit package and the ops operator package are installed, parse it there, and view the visualized parsing results.

API Overview

Table 1 APIs

| API | Description |
| --- | --- |
| acl.prof.create_config | Creates a profiling configuration. This API is used together with acl.prof.destroy_config. |
| acl.prof.init | Initializes profiling and sets the path for saving profile data files. This API is used together with acl.prof.finalize. |
| acl.prof.start | Starts profiling. This API is used together with acl.prof.stop. |
| acl.prof.stop | Stops profiling. This API is used together with acl.prof.start. |
| acl.prof.finalize | Finalizes profiling. This API is used together with acl.prof.init. |
| acl.prof.destroy_config | Destroys data of the aclprofConfig type created by the acl.prof.create_config call. This API is used together with acl.prof.create_config. |

  • After acl.prof.init is called, all subsequent model loading data is collected, including the device, host, and timeline summary data. If the acl.prof.start call specifies only some of the devices, parsing the profile data of the remaining devices fails, because only model loading data is available for them.
  • The running user must have read and write permissions on the flush directory passed to the acl.prof.init call.
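
The permission requirement in the preceding note can be checked before profiling is initialized. The following is a minimal sketch that uses only the Python standard library. The helper name ensure_flush_dir is hypothetical and not part of pyACL, and creating the directory when it does not exist is an assumption of this sketch, not a documented requirement.

```python
import os


def ensure_flush_dir(path):
    """Return the absolute flush path if the current user can read and write it.

    Hypothetical helper, not part of pyACL. Creating a missing directory is an
    assumption of this sketch.
    """
    abs_path = os.path.abspath(path)
    # Raises FileExistsError if the path exists but is not a directory.
    os.makedirs(abs_path, exist_ok=True)
    # os.access checks the permissions of the user running this process.
    if not os.access(abs_path, os.R_OK | os.W_OK):
        raise PermissionError(f"no read/write permission on {abs_path}")
    return abs_path
```

The returned path could then be passed to acl.prof.init in place of a hard-coded string.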

API Call Examples

The following example shows the API call sequence:

import acl
import numpy as np
# ......

# 1. Allocate runtime resources, including setting the compute device, creating a context, and creating a stream.
# ......

# 2. Load a model. After the model is successfully loaded, model_id that identifies the model is returned.
# ......

# 3. Create data of type aclmdlDataset to describe the inputs and outputs of the model.
# ......

# 4. Initialize profiling.
# Set the path to which profile data is flushed.
PROF_INIT_PATH = '...'
ret = acl.prof.init(PROF_INIT_PATH)

# 5. Configure profiling.
device_list = [0]    # Set this parameter based on the device ID in the actual environment.
# Data types to collect. Combine them with the bitwise OR operator.
ACL_PROF_ACL_API = 0x0001
ACL_PROF_TASK_TIME = 0x0002
ACL_PROF_AICORE_METRICS = 0x0004
# Configuration item passed to acl.prof.set_config.
ACL_PROF_SYS_HARDWARE_MEM_FREQ = 3

# Create the profiling configuration. The call returns the address of the configuration object.
prof_config = acl.prof.create_config(device_list, 0, 0, ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_AICORE_METRICS)
mem_freq = "15"
ret = acl.prof.set_config(ACL_PROF_SYS_HARDWARE_MEM_FREQ, mem_freq)
ret = acl.prof.start(prof_config)

# 6. Execute the model. input and output are the aclmdlDataset objects created in step 3.
ret = acl.mdl.execute(model_id, input, output)

# 7. Process the model inference result.
# ......

# 8. Destroy allocations such as the model inputs and outputs, free memory, and unload the model.
# ......

# 9. Stop profiling and destroy the configuration and related resources.
ret = acl.prof.stop(prof_config)
ret = acl.prof.destroy_config(prof_config)
ret = acl.prof.finalize()

# 10. Destroy runtime allocations.
# ......
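
The preceding example stores each return value in ret but does not inspect it, and it relies on the caller to run the paired teardown calls in the right order. One way to handle both is to wrap the calls in a context manager, sketched below. This is not part of pyACL: the names check_ret and profiling_session are hypothetical, the sketch assumes that the status-returning calls return 0 on success, and the acl.prof module is passed in as the prof argument so that the code can be tried with a stand-in object where pyACL is not installed.

```python
from contextlib import contextmanager

ACL_SUCCESS = 0  # Assumption: status-returning pyACL calls return 0 on success.


def check_ret(api_name, ret):
    """Raise if a call reported a non-zero status code (hypothetical helper)."""
    if ret != ACL_SUCCESS:
        raise RuntimeError(f"{api_name} failed, ret={ret}")


@contextmanager
def profiling_session(prof, flush_path, device_list, data_type_config):
    """Run the body of a with block between prof.start and prof.stop.

    prof is expected to be the acl.prof module (hypothetical wrapper).
    """
    check_ret("acl.prof.init", prof.init(flush_path))
    try:
        # The two zeros mirror the second and third arguments in the example above.
        prof_config = prof.create_config(device_list, 0, 0, data_type_config)
        try:
            check_ret("acl.prof.start", prof.start(prof_config))
            try:
                yield prof_config
            finally:
                check_ret("acl.prof.stop", prof.stop(prof_config))
        finally:
            check_ret("acl.prof.destroy_config", prof.destroy_config(prof_config))
    finally:
        check_ret("acl.prof.finalize", prof.finalize())


# Intended usage with pyACL (not executed here):
#
#     with profiling_session(acl.prof, PROF_INIT_PATH, [0], ACL_PROF_ACL_API):
#         ret = acl.mdl.execute(model_id, input, output)
```

The nested try/finally blocks undo only what was set up: if acl.prof.start fails, acl.prof.stop is skipped, but the configuration is still destroyed and profiling is still finalized. The acl.prof.set_config call from the example is left out to keep the sketch short.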

The following lists the values accepted by the configuration parameters of the APIs called in the preceding example. Select the values you need.
  • acl.prof.create_config (data_type_config): combine the required values with the bitwise OR operator.
    ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_AICORE_METRICS | ACL_PROF_AICPU | ACL_PROF_L2CACHE | ACL_PROF_HCCL_TRACE | ACL_PROF_MSPROFTX | ACL_PROF_RUNTIME_API | ACL_PROF_TRAINING_TRACE

    For details about the parameters, see the data_type_config description of "Function: create_config".

  • acl.prof.set_config (config_type): each call sets one of the following items, as in the preceding example.
    ACL_PROF_STORAGE_LIMIT, ACL_PROF_SYS_HARDWARE_MEM_FREQ, ACL_PROF_LLC_MODE, ACL_PROF_SYS_IO_FREQ, ACL_PROF_SYS_INTERCONNECTION_FREQ, ACL_PROF_DVPP_FREQ, ACL_PROF_HOST_SYS, ACL_PROF_HOST_SYS_USAGE, ACL_PROF_HOST_SYS_USAGE_FREQ

    For details about the parameters, see the config_type description of "Function: set_config".
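
For acl.prof.create_config, the data_type_config values are bit flags, so a combination can be built and inspected with ordinary integer operations. The sketch below uses only the three flag values given in the call example; the values of the other flags are not given in this section and are therefore left out. The FLAG_NAMES table and the enabled_flags helper are hypothetical and serve only to print readable names.

```python
# Flag values copied from the call example.
ACL_PROF_ACL_API = 0x0001
ACL_PROF_TASK_TIME = 0x0002
ACL_PROF_AICORE_METRICS = 0x0004

# Hypothetical lookup table, used only to map bits back to names.
FLAG_NAMES = {
    ACL_PROF_ACL_API: "ACL_PROF_ACL_API",
    ACL_PROF_TASK_TIME: "ACL_PROF_TASK_TIME",
    ACL_PROF_AICORE_METRICS: "ACL_PROF_AICORE_METRICS",
}


def enabled_flags(data_type_config):
    """Return the names of the known flags that are set in the mask."""
    return [name for flag, name in FLAG_NAMES.items() if data_type_config & flag]


# The combination passed to acl.prof.create_config in the call example.
data_type_config = ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_AICORE_METRICS
print(hex(data_type_config))  # 0x7
print(enabled_flags(ACL_PROF_ACL_API | ACL_PROF_AICORE_METRICS))
```

Dropping a flag from the OR expression disables collection of that data type without affecting the others, which is why the values are distinct powers of two.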