On-Chip Memory Stress Test Fails Due to Insufficient Device Memory
Symptom
Ascend DMI fails to run the on-chip memory stress test and displays the message "Error occurred in HBM stress test on device 0". In addition, the error message "aclrtMalloc failed, error code: 207001" is recorded in the log file /var/log/ascend-dmi/ascend-dmi.log.

Possible Causes
The device memory is insufficient or occupied.
Solution
- Run the npu-smi info command to check whether the memory is used up. If the following information is displayed, the memory is used up.

- Wait for the memory to be released, or run the following command to reset the processor and release the memory. Replace $i with the ID of the device to reset:
npu-smi set -t reset -i $i -c 0
Figure 1 Command example
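The check and the reset described above can be sketched in Python. This is a minimal illustration, not part of Ascend DMI: the helper names find_malloc_failures and reset_command are hypothetical, the error text is the one quoted in the Symptom section, and the command is the one given in the Solution section. The sketch only builds the reset command and does not execute it.

```python
import re

# Error text quoted in the Symptom section of this topic.
MALLOC_ERROR = re.compile(r"aclrtMalloc failed, error code: 207001")


def find_malloc_failures(log_lines):
    """Return the log lines that contain the aclrtMalloc 207001 error.

    Hypothetical helper: accepts any iterable of strings, for example an
    open file object for /var/log/ascend-dmi/ascend-dmi.log.
    """
    return [line.rstrip("\n") for line in log_lines if MALLOC_ERROR.search(line)]


def reset_command(device_id):
    """Build the reset command from the Solution section as an argument list.

    Hypothetical helper: it only constructs the command. Running it is left
    to the operator, because a reset interrupts work on the device.
    """
    if not isinstance(device_id, int) or device_id < 0:
        raise ValueError("device_id must be a non-negative integer")
    return ["npu-smi", "set", "-t", "reset", "-i", str(device_id), "-c", "0"]
```

To use the sketch, open the log file, pass the file object to find_malloc_failures, and, if any lines are returned, pass the result of reset_command(0) to subprocess.run on the affected host. Returning an argument list rather than a single string avoids shell quoting issues.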