P2P Stress Test
Function
Tests whether a hardware fault has occurred on the HCCS communication link between the specified source device and destination device, and outputs the test result. Use this function when training accuracy is abnormal and a hardware fault on the HCCS communication link is the suspected cause.
| Item | Time Required | Whether NPU Training or Inference Is Affected | Application Scenario |
|---|---|---|---|
| P2P stress test | 1–5 minutes | Yes | An exception occurred during data copy between devices. |
Parameters
Table 2 lists only test-specific parameters. For details about other common parameters, see Common Parameters.
Example
ascend-dmi -dg -i bandwidth --type p2p -s -q
- Default mode
    [***@***]# ascend-dmi -dg -i bandwidth --type p2p -s -q
    Summary:
        Arch: aarch64
        Mode: ******
        Time: 20250529-19:55:23
    Hardware:
        bandwidth: PASS
- If an unsupported device is used to perform a P2P stress test, the following information is displayed:
    [***@***]# ascend-dmi -dg -i bandwidth --type p2p -s -q
    Summary:
        Arch: aarch64
        Mode: ******
        Time: 20250529-19:51:57
    Hardware:
        bandwidth: SKIP
        *** The current device does not support the p2p stress test.
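When the test is run from an automation script, the status can be read from the `bandwidth:` line of the captured output. The following is a minimal Python sketch, not part of the tool: the function name `parse_bandwidth_result` is hypothetical, and the line-by-line layout of the sample text is an assumption based on the example output shown here.

```python
import re
from typing import Optional

# Sample text modeled on the example output above.
# Assumption: each field is printed on its own line.
SAMPLE_OUTPUT = """\
[***@***]# ascend-dmi -dg -i bandwidth --type p2p -s -q
Summary:
    Arch: aarch64
    Mode: ******
    Time: 20250529-19:51:57
Hardware:
    bandwidth: SKIP
    *** The current device does not support the p2p stress test.
"""


def parse_bandwidth_result(output: str) -> Optional[str]:
    """Return the status on the 'bandwidth:' line, or None if no such line exists."""
    # Anchor at line start so "bandwidth" inside the echoed command is not matched.
    match = re.search(r"^\s*bandwidth:\s*(\S+)", output, flags=re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(parse_bandwidth_result(SAMPLE_OUTPUT))
```

The pattern is anchored to the start of a line, so the word `bandwidth` in the echoed command line is ignored and only the result line is matched.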
Fault Check Items
| Command Output | Description |
|---|---|
| PASS | The stress test passed, and the result is normal. |
| SKIP | The product or scenario does not support the P2P stress test. |
| EMERGENCY_WARN | Emergency warning. The stress test failed. Contact Huawei engineers to replace the hardware. |
| FAIL | The P2P stress test failed. Contact Huawei technical support. |
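The table above can be turned into a small lookup for scripted handling of the result. This is an illustrative Python sketch only: the names `ACTIONS` and `recommended_action` are hypothetical, the messages paraphrase the descriptions in the table, and the fallback for unlisted statuses is an assumption rather than documented tool behavior.

```python
# Status values and follow-up actions taken from the Fault Check Items table.
ACTIONS = {
    "PASS": "The result is normal; no action is needed.",
    "SKIP": "The product or scenario does not support the P2P stress test.",
    "EMERGENCY_WARN": "The stress test failed; contact Huawei engineers to replace the hardware.",
    "FAIL": "The P2P stress test failed; contact Huawei technical support.",
}


def recommended_action(status: str) -> str:
    """Map a reported status to the follow-up action listed in the table."""
    key = status.strip().upper()
    # Assumption: any status not listed in the table is flagged for manual review.
    return ACTIONS.get(key, f"Unrecognized status {status!r}; review the tool output manually.")


if __name__ == "__main__":
    for status in ("PASS", "SKIP", "EMERGENCY_WARN", "FAIL"):
        print(f"{status}: {recommended_action(status)}")
```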