AICORE Stress Test
Function
Performs a stress test to detect AICORE errors and outputs the diagnosis result.
| Item | Time Required | Whether NPU Training or Inference Is Affected | Application Scenario |
|---|---|---|---|
| AICORE stress test | 9–24 minutes | Yes | An AICORE error occurs when a training or inference job is executed. |
- The AICORE stress test and AICORE diagnosis apply to different scenarios. For details, see Table 1. Perform the AICORE stress test and AICORE diagnosis as required.
- To run the AICORE, full on-chip memory, and P2P stress tests together, see One-Click Diagnosis.
Parameters
Table 2 lists only test-specific parameters. For details about other common parameters, see Common Parameters.
| Parameter | Description | Mandatory |
|---|---|---|
| [-i, --items] | Specifies the diagnosis check item. | Yes |
| [-s, --stress] | Performs a stress test. | Yes |
| [-sc, --sc, --stress-count] | Specifies the number of AICORE stress tests. | No |
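The split between mandatory and optional parameters can be sketched as a small shell helper. This is an illustration only: the function name `aicore_stress_cmd` is hypothetical and not part of ascend-dmi, the `-dg` and `-q` options are taken from the example in this section rather than from the table, and rejecting non-numeric counts is an assumption because the valid range of `-sc` is not stated here.

```shell
# Hypothetical helper (not part of ascend-dmi): prints the stress test
# command line instead of running it.
# -i and -s are always emitted because the table marks them mandatory;
# -sc is appended only when a count is supplied.
aicore_stress_cmd() {
    count="${1:-}"
    cmd="ascend-dmi -dg -i aicore -s"
    if [ -n "$count" ]; then
        case "$count" in
            # Assumption of this sketch: the count must consist of digits only.
            *[!0-9]*) echo "stress count must be a whole number" >&2; return 1 ;;
        esac
        cmd="$cmd -sc $count"
    fi
    echo "$cmd -q"
}
```

Calling `aicore_stress_cmd 3` prints the command with `-sc 3`; calling it with no argument omits `-sc` entirely.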
Example
Example of setting the number of stress tests to 3:
ascend-dmi -dg -i aicore -s -sc 3 -q
    [***@***]# ascend-dmi -dg -i aicore -s -sc 3 -q
    Stress test is being performed, please wait.
    Summary:
    Arch: aarch64
    Mode: ******
    Time: 20250529-19:51:09
    Hardware:
    aicore: PASS
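For scripted runs, the result can be pulled out of the summary. The following is a minimal sketch, assuming the output keeps the `aicore: <RESULT>` line on its own line as in the example above; the function name `aicore_result` is hypothetical and not part of ascend-dmi.

```shell
# Hypothetical helper (not part of ascend-dmi): reads diagnosis output on
# stdin and prints the value from the first "aicore:" line, for example PASS.
# Returns a non-zero status if no such line is present.
aicore_result() {
    awk '
        $1 == "aicore:" { print $2; found = 1; exit }
        END { if (!found) exit 1 }
    '
}
```

A possible use is `ascend-dmi -dg -i aicore -s -sc 3 -q | aicore_result`, which would print only the result string if the output layout matches the assumption.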
Fault Check Items
| Command Output | Description |
|---|---|
| PASS | The stress test result is normal. |
| SKIP | The product or scenario does not support the AICORE stress test. |
| EMERGENCY_WARN | Emergency warning. Replace the hardware. |
| FAIL | The AICORE stress test fails. Contact Huawei technical support. |
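For automation, the table above can be expressed as a small dispatch on the result string. This is a sketch only: the function name `aicore_action` and the numeric return codes are assumptions chosen for illustration, not values defined by ascend-dmi.

```shell
# Hypothetical helper (not part of ascend-dmi): prints the follow-up action
# for a result string and returns an illustrative status:
#   0 = no action needed, 1 = action required, 2 = unrecognized result.
# These numeric codes are assumptions of this sketch.
aicore_action() {
    case "${1:-}" in
        PASS)           echo "Stress test result is normal."; return 0 ;;
        SKIP)           echo "AICORE stress test is not supported for this product or scenario."; return 0 ;;
        EMERGENCY_WARN) echo "Emergency warning: replace the hardware."; return 1 ;;
        FAIL)           echo "AICORE stress test failed: contact Huawei technical support."; return 1 ;;
        *)              echo "Unrecognized result: ${1:-}"; return 2 ;;
    esac
}
```

For example, `aicore_action EMERGENCY_WARN` prints the hardware replacement message and returns 1, so a calling script can stop scheduling jobs on that device.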