Restrictions
Permission Restrictions
- Follow the principle of least privilege. For example, do not allow other users to write data, which is usually enforced by forbidding permission modes such as 666 and 777.
- Before using the Profiling tool, ensure that the umask value of the execution user is greater than or equal to 0027. Otherwise, the permissions on the directories and files that hold the collected profile data will be too permissive.
- You can check the umask value by running the umask command.
- You can change the umask value by running the umask NewValue command.
- Ensure that the profile data is stored in a directory of the current user whose path does not contain soft links. Otherwise, security problems may occur.
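The umask and soft-link checks above can be sketched in Python as a pre-flight check. This is a minimal sketch and not part of the Profiling tool: the helper names current_umask, umask_is_strict_enough, and has_symlink_component are hypothetical, and reading "greater than or equal to 0027" as "every permission bit masked by 0027 is also masked" is an assumption.

```python
import os

REQUIRED_UMASK = 0o027  # value named in the restriction above


def current_umask():
    """Return the process umask without leaving it changed."""
    old = os.umask(0o077)  # os.umask sets a new value and returns the old one
    os.umask(old)          # restore the original value immediately
    return old


def umask_is_strict_enough(mask, required=REQUIRED_UMASK):
    """Assumption: 'at least 0027' means all bits of 0027 are set in mask."""
    return (mask & required) == required


def has_symlink_component(path):
    """Return True if the path or any of its parent directories is a soft link."""
    current = os.path.abspath(path)  # abspath does not resolve soft links
    while True:
        if os.path.islink(current):
            return True
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return False
        current = parent
```

For example, umask_is_strict_enough(0o022) returns False, because a umask of 0022 still leaves read and execute permission for other users.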
Execution Restrictions
- Do not start more than one collection task on the same device.
- You are advised not to use the profile data collection function together with the dump function.
The dump operation could affect the system performance. If the data collection and dump functions are both enabled, the collected profile data metrics will be inaccurate. Therefore, disable data dump before starting data collection.
Data Flush Restrictions
- Paths starting with "~" are not recognized.
- Keep profile data sampling within 5 minutes, and reserve memory and drive space of at least 20 times the size of the raw profile data. The raw data size is the total amount of data in the data directory after the data has been collected and stored.
- When a single data collection task runs with all collection items enabled and flushes the collected data to a disk, the disk read/write speed must meet the following requirements:
- If your inference job runs on a single device, the disk read/write speed must be greater than or equal to 50 MB/s.
- If your training job runs on a single device, the disk read/write speed must be greater than or equal to 60 MB/s.
- If your job runs on multiple devices, the disk read/write speed must be greater than or equal to that of a single device multiplied by the number of devices.
- Profile data cannot be flushed to a disk if the disk is full during data collection. Therefore, it is necessary to leave enough space on the disk.
Atlas 200/300/500 Inference Product: The flushed raw profile data needs to be manually cleared to prevent the disk space from being used up.
Atlas Training Series Product: You can configure --storage-limit to prevent the disk space from being used up by the profile data.
- During profile data parsing, if the configured disk or user directory space is full, the parsing fails or the file cannot be flushed to the disk. In this case, you need to clear the disk or user directory space.
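The figures in this list (20 times the raw data size, 50 MB/s per device for inference, 60 MB/s per device for training) can be turned into a small planning helper. The following is a minimal sketch under stated assumptions: all function and constant names are hypothetical, the raw data size has to be estimated by the caller because it is only known after collection, and the helper does not measure the actual disk speed.

```python
import os
import shutil

SPACE_FACTOR = 20  # reserve at least 20x the raw profile data size
MIN_SPEED_MB_PER_S = {"inference": 50, "training": 60}  # per device


def normalize_output_path(path):
    """Expand a leading "~" and return an absolute path."""
    return os.path.abspath(os.path.expanduser(path))


def required_free_bytes(raw_data_bytes):
    """Space to reserve for a given (estimated) raw data size."""
    return SPACE_FACTOR * raw_data_bytes


def required_disk_speed(job_type, device_count):
    """Minimum disk read/write speed in MB/s for the whole job."""
    return MIN_SPEED_MB_PER_S[job_type] * device_count


def has_enough_space(existing_dir, raw_data_bytes):
    """Compare free space on the drive holding existing_dir with the 20x rule."""
    free = shutil.disk_usage(existing_dir).free
    return free >= required_free_bytes(raw_data_bytes)
```

For instance, required_disk_speed("training", 8) gives 480, which is the single-device training figure multiplied by eight devices. Expanding "~" in the calling script before handing the path to the tool avoids the unrecognized-path restriction.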
Compatibility and Scenario Restrictions
- Python 3.7.5 or later is required.
- The development of an application project must comply with the CANN AscendCL Application Software Development Guide (C&C++). To obtain complete profile data, call aclInit() to initialize AscendCL and call aclFinalize() to deinitialize AscendCL.
If the application calls aclInit() but not aclFinalize(), the collection process terminates abnormally and the collected data is incomplete. The data sampled within the last second may be lost due to delayed flushing, but the lost data is no more than 2 MB and does not affect the analysis of the profile data that has already been flushed.
- When you use the msprof command line tool to collect profile data for an application project developed with pyACL APIs, the Python scripts of the project must not open files by relative path. Otherwise, an error is reported during profile data collection.
- The following table lists the collection switches supported in Ascend virtual instance scenarios.
- The table uses the msprof command line switches and the Profiling options switches as examples.
- For Atlas 200/300/500 Inference Products, Ascend virtual instances are not involved.
Table 1 Collection switches supported in Ascend virtual instance scenarios

| Switch | Collected Content | Atlas Training Series Product |
| --- | --- | --- |
| msproftx | msproftx | Supported |
| host-sys=cpu | CPU | Supported |
| host-sys=mem | Memory | Supported |
| host-sys=disk | Disk | Not supported |
| host-sys=network | Network | Supported |
| host-sys=osrt | osrt | Not supported |
| ascendcl | ACL | Supported |
| model-execution | GE | Supported |
| runtime-api, task-time, task_trace (options) | Runtime | Supported |
| hccl, task_trace (options) | HCCL | Supported |
| aicpu | DATAPROCESS | Supported |
| aic-metrics, aic-mode=task-base | AI Core (task-based) | Supported |
| training_trace (options), task-time, task_trace (options) | TSFW | Not supported |
| l2 | L2 Cache | Not supported |
| sys-io-profiling | NIC, RoCE | Not supported |
| dvpp-profiling | DVPP | Not supported |
| sys-hardware-mem | On-chip memory and LLC | Not supported |
| sys-interconnection-profiling | PCIe, HCCS | Not supported |
| sys-cpu-profiling | AI CPU, CTRL CPU, TS CPU | Not supported |
| sys-profiling, sys-pid-profiling | CPU, memory | Not supported |
| aic-metrics, aic-mode=sample-base | AI Core (sample-based) | Not supported |
| task-time, task_trace (options) | HWTS_LOG | Supported |
| training_trace (options) | FMK | Supported |
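Because one switch in Table 1 can feed several kinds of collected content with different support status (task-time, for example), a lookup keyed by switch needs to return both groups. The sketch below transcribes Table 1 into Python data. The structure TABLE_1 and the function content_by_support are hypothetical names for illustration, and treating every switch in a row as individually associated with that row's content is an assumption.

```python
# (switches, collected content, supported in Ascend virtual instance scenarios)
TABLE_1 = [
    (("msproftx",), "msproftx", True),
    (("host-sys=cpu",), "CPU", True),
    (("host-sys=mem",), "Memory", True),
    (("host-sys=disk",), "Disk", False),
    (("host-sys=network",), "Network", True),
    (("host-sys=osrt",), "osrt", False),
    (("ascendcl",), "ACL", True),
    (("model-execution",), "GE", True),
    (("runtime-api", "task-time", "task_trace (options)"), "Runtime", True),
    (("hccl", "task_trace (options)"), "HCCL", True),
    (("aicpu",), "DATAPROCESS", True),
    (("aic-metrics", "aic-mode=task-base"), "AI Core (task-based)", True),
    (("training_trace (options)", "task-time", "task_trace (options)"),
     "TSFW", False),
    (("l2",), "L2 Cache", False),
    (("sys-io-profiling",), "NIC, RoCE", False),
    (("dvpp-profiling",), "DVPP", False),
    (("sys-hardware-mem",), "On-chip memory and LLC", False),
    (("sys-interconnection-profiling",), "PCIe, HCCS", False),
    (("sys-cpu-profiling",), "AI CPU, CTRL CPU, TS CPU", False),
    (("sys-profiling", "sys-pid-profiling"), "CPU, memory", False),
    (("aic-metrics", "aic-mode=sample-base"), "AI Core (sample-based)", False),
    (("task-time", "task_trace (options)"), "HWTS_LOG", True),
    (("training_trace (options)",), "FMK", True),
]


def content_by_support(switch):
    """Return (supported, unsupported) collected content for one switch."""
    supported, unsupported = [], []
    for switches, content, ok in TABLE_1:
        if switch in switches:
            (supported if ok else unsupported).append(content)
    return supported, unsupported
```

Calling content_by_support("task-time") returns the Runtime and HWTS_LOG content as supported and TSFW as unsupported, which matches the three rows of Table 1 that list task-time.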