Creating a Configuration File for Profiling
The profiler collects profile data based on settings in a .json file that defines whether to profile data and where to store it.
- Automatic creation: After the SERVICE_PROF_CONFIG_PATH environment variable is configured as described in Profiling, MindIE Motor can automatically create a .json file with the default settings.
- Manual creation: This .json configuration file can be created in any directory. The following uses the ms_service_profiler_config.json file as an example. The file format is as follows:
{
    "enable": 1,
    "prof_dir": "${PATH}",
    "profiler_level": "INFO",
    "acl_task_time": 0,
    "acl_prof_task_time_level": "",
    "aclDataTypeConfig": "",
    "aclprofAicoreMetrics": "",
    "api_filter": "",
    "kernel_filter": "",
    "timelimit": 0,
    "domain": ""
}
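For manual creation, the file can also be generated by a short script instead of being typed by hand. The following is a minimal sketch, not an official tool: the helper name write_profiler_config and the temporary directory are hypothetical, the keys and values are copied from the example above, and it assumes that SERVICE_PROF_CONFIG_PATH is expected to hold the path of the .json file.

```python
import json
import os
import tempfile
from pathlib import Path

# Keys and values copied from the example file above.
EXAMPLE_CONFIG = {
    "enable": 1,
    "prof_dir": "${PATH}",
    "profiler_level": "INFO",
    "acl_task_time": 0,
    "acl_prof_task_time_level": "",
    "aclDataTypeConfig": "",
    "aclprofAicoreMetrics": "",
    "api_filter": "",
    "kernel_filter": "",
    "timelimit": 0,
    "domain": "",
}


def write_profiler_config(config_path, prof_dir, enable=1):
    """Write the example settings to config_path and return the path.

    Hypothetical helper: only enable and prof_dir are overridden.
    """
    config = dict(EXAMPLE_CONFIG, enable=enable, prof_dir=str(prof_dir))
    path = Path(config_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())  # placeholder location for this demo
    cfg = write_profiler_config(
        workdir / "ms_service_profiler_config.json", workdir / "prof_data"
    )
    # Assumption: the variable points at the .json file itself.
    os.environ["SERVICE_PROF_CONFIG_PATH"] = str(cfg)
    print(cfg.read_text(encoding="utf-8"))
```

Writing the file through json.dumps keeps it valid JSON, so a stray comma or quote cannot slip in when a value is changed later.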
| Option | Description | Required (Yes/No) |
|---|---|---|
| enable | Whether to enable profiling. The options are as follows: | Yes |
| prof_dir | Path for storing profile data. The value can be a custom character string. The default value is ${HOME}/.ms_server_profiler. | No |
| profiler_level | Profiling level. The value is INFO. | No |
| host_system_usage_freq | Frequency at which CPU and memory system metrics are profiled. Profiling of these metrics is disabled by default. The value is an integer ranging from 1 to 50, in Hz, indicating the number of profiling operations per second. If this parameter is set to -1, profiling of these metrics is disabled. NOTE: Enabling this function may occupy a large amount of memory. You are advised not to modify the value. | No |
| npu_memory_usage_freq | Frequency at which NPU memory usage metrics are profiled. Profiling of these metrics is disabled by default. The value is an integer ranging from 1 to 50, in Hz, indicating the number of profiling operations per second. If this parameter is set to -1, profiling of these metrics is disabled. NOTE: Enabling this function may occupy a large amount of memory. You are advised not to modify the value. | No |
| acl_task_time | Whether to enable profiling for operator delivery and execution durations. The options are as follows: | No |
| acl_prof_task_time_level | Profiling level and duration. The options are as follows: By default, this parameter is not set, indicating that L0 data is profiled until the program execution is complete. If an invalid value is set, the default value is used. The profiling level and duration can be configured at the same time, for example, "acl_prof_task_time_level": "L1;10". | No |
| aclDataTypeConfig | Profile data type. Each macro indicates a type of profile data, and one or more of the following macros can be combined with a logical OR. The options are as follows: For details about the results of the following profiling items, see Profiling Description. The actual results may vary. You can configure one or more of the following profiling items at a time, for example, "aclDataTypeConfig": "ACL_PROF_ACL_API" or "aclDataTypeConfig": "ACL_PROF_ACL_API, ACL_PROF_TASK_TIME". By default, this parameter is not set, and the system defaults to "acl_prof_task_time_level": "L0". | No |
| aclprofAicoreMetrics | AI Core metrics to profile. The options are as follows: For details about the results of the following profiling items, see op_summary (Operator Details). The actual results may vary. Only one of the following profiling items can be configured at a time, for example, "aclprofAicoreMetrics": "ACL_AICORE_PIPE_UTILIZATION". The default value is ACL_AICORE_PIPE_UTILIZATION. This configuration takes effect only when aclDataTypeConfig is set to ACL_PROF_AICORE_METRICS. | No |
| api_filter | Filter for API profile data. You can customize the API profile data to be collected. For example, if matmul is passed, the profile data of all APIs whose names contain matmul is flushed to the drive. The value is a case-sensitive string. Multiple filter criteria must be separated by semicolons (;). By default, this parameter is left blank, indicating that all data is flushed to the drive. This parameter is valid only when acl_task_time is set to 2. | No |
| kernel_filter | Filter for kernel profile data. You can customize the kernel profile data to be collected. For example, if matmul is passed, the profile data of all kernels whose names contain matmul is flushed to the drive. The value is a case-sensitive string. Multiple filter criteria must be separated by semicolons (;). By default, this parameter is left blank, indicating that all data is flushed to the drive. This parameter is valid only when acl_task_time is set to 2. | No |
| timelimit | Profiling duration. After this parameter is set, profiling stops automatically once the specified duration has elapsed. The value is an integer ranging from 0 to 7200, in seconds. The default value is 0, indicating that the profiling duration is not limited. NOTE: You are advised to set the profiling duration to at least 120s. If the profiling duration is too short, the data may not meet the requirements for generating the parsing output. In this case, an alarm is printed. | No |
| domain | Domain to profile. Specifying domains helps reduce the amount of data to profile. The value is a string of case-sensitive domain names separated by semicolons (;), for example, "Request; KVCache". By default, this parameter is left blank, indicating that all domains are profiled. The existing domains are Request, KVCache, ModelExecute, BatchSchedule, Communication, and eplb_observe. If the eplb_observe domain is configured and MINDIE_ENABLE_EXPERT_HOTPOT_GATHER and MINDIE_EXPERT_HOTPOT_DUMP_PATH are enabled, the profile data contains expert hotspot information, and the parsing results are used to generate an expert hotspot heatmap. You are advised to enable the eplb_observe domain separately if expert hotspot information needs to be profiled. NOTE: An alarm is triggered if an incomplete domain configuration results in insufficient data for parsing and generating output files. For details, see Table 1. | No |
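Several of the constraints in the table can be checked before the service is started. The following is a minimal sketch of such a check, not part of the profiler: the function name check_config is hypothetical, it covers only the rules stated in the table (required enable, INFO level, frequency and timelimit ranges, known domain names, and the acl_task_time condition for the filters), and it assumes that whitespace around domain names is ignored, as the "Request; KVCache" example suggests.

```python
KNOWN_DOMAINS = {
    "Request", "KVCache", "ModelExecute",
    "BatchSchedule", "Communication", "eplb_observe",
}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_config(config):
    """Return a list of problems found in a config dict (empty if none)."""
    problems = []

    if "enable" not in config:
        problems.append("enable is required")

    if config.get("profiler_level", "INFO") != "INFO":
        problems.append("profiler_level must be INFO")

    timelimit = config.get("timelimit", 0)
    if not _is_int(timelimit) or not 0 <= timelimit <= 7200:
        problems.append("timelimit must be an integer from 0 to 7200")

    for key in ("host_system_usage_freq", "npu_memory_usage_freq"):
        if key in config:
            freq = config[key]
            if not _is_int(freq) or not (freq == -1 or 1 <= freq <= 50):
                problems.append(f"{key} must be -1 or an integer from 1 to 50")

    domain = config.get("domain", "")
    if not isinstance(domain, str):
        problems.append("domain must be a string")
    else:
        # Assumption: spaces around names are ignored; names are case sensitive.
        names = [name.strip() for name in domain.split(";") if name.strip()]
        unknown = [name for name in names if name not in KNOWN_DOMAINS]
        if unknown:
            problems.append("unknown domain(s): " + ", ".join(unknown))

    for key in ("api_filter", "kernel_filter"):
        if config.get(key) and config.get("acl_task_time") != 2:
            problems.append(f"{key} is valid only when acl_task_time is 2")

    return problems


if __name__ == "__main__":
    sample = {"enable": 1, "timelimit": 60, "domain": "Request; kvcache"}
    for problem in check_config(sample):
        print(problem)
```

The check returns messages rather than raising an exception so that every problem in a file is reported in one pass. The 120s recommendation for timelimit is advisory in the table, so the sketch does not treat shorter values as errors.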