Eye Diagram Diagnosis

Function

Diagnoses the signal quality of PCIe, HCCS, and RoCE links and outputs the diagnosis result.

Table 1 Diagnostic items

| Item | Time Required | Whether NPU Training or Inference Is Affected | Application Scenario |
| --- | --- | --- | --- |
| signalQuality | 10 s to 2 min | No | A PCIe, HCCS, or RoCE link fault occurs during a training or inference job. |

Parameters

Table 2 lists only test-specific parameters. For details about other common parameters, see Common Parameters.

Table 2 Parameter description

Parameter

Description

Mandatory

[-i, --items]

Specifies the diagnosis check item.
  • signalQuality indicates signal quality diagnosis of PCIe, HCCS, and RoCE.

Yes

[-lt, --lt, --link-type]

Specifies the link type for eye diagram diagnosis. The options are as follows:

  • hccs: eye diagram diagnosis of the HCCS link. Supported by the Atlas 300I Duo inference card, Atlas A2 training product, Atlas 800I A2 inference server, A200I A2 Box heterogeneous component, Atlas A3 inference product, and Atlas A3 training product. Not supported by the Atlas 800I A2 inference server (32 GB PCIe).
  • pcie: eye diagram diagnosis of the PCIe link. Supported by the Atlas 300I Duo inference card, Atlas A2 training product, Atlas 800I A2 inference server, A200I A2 Box heterogeneous component, and A200T A3 Box8 SuperPoD Server. When only the secondary chip is specified for the Atlas 300I Duo inference card, the signal quality of the PCIe link is not diagnosed.
  • roce: eye diagram diagnosis of the RoCE link. Supported by the Atlas A2 training product, Atlas 800I A2 inference server, A200I A2 Box heterogeneous component, Atlas A3 inference product, and Atlas A3 training product.

No
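
As an illustration of how the parameters in Table 2 combine, the following Python sketch assembles the command line. The names build_command and SUPPORTED_LINK_TYPES are hypothetical helpers, not part of ascend-dmi. The comma-separated form of --link-type follows the hccs,pcie example in this section; passing roce in the same list is an assumption.

```python
from typing import List, Optional, Sequence

# Link types listed in Table 2.
SUPPORTED_LINK_TYPES = ("hccs", "pcie", "roce")


def build_command(link_types: Optional[Sequence[str]] = None) -> List[str]:
    """Build the argument list for a signalQuality diagnosis (hypothetical helper)."""
    # -i signalQuality is mandatory; --link-type is optional.
    command = ["ascend-dmi", "-dg", "-i", "signalQuality"]
    if link_types:
        unknown = [name for name in link_types if name not in SUPPORTED_LINK_TYPES]
        if unknown:
            raise ValueError(f"unsupported link type(s): {', '.join(unknown)}")
        # Assumption: several link types are joined with commas, as in "hccs,pcie".
        command += ["--link-type", ",".join(link_types)]
    return command


print(" ".join(build_command(["hccs", "pcie"])))
# prints: ascend-dmi -dg -i signalQuality --link-type hccs,pcie
```

The returned list can be passed to subprocess.run on a host where ascend-dmi is installed. Whether a given link type is actually diagnosed still depends on the product support listed above.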

Example

ascend-dmi -dg -i signalQuality --link-type hccs,pcie

[***@***]# ascend-dmi -dg -i signalQuality --link-type hccs,pcie
Summary:
    Arch: aarch64
    Mode: ******
    Time: 20250529-19:24:32
 
Hardware:
    signalQuality:
        PASS
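
For scripted checks, the result can be read from the command output. The Python sketch below is a minimal illustration that assumes the output always has the layout of the sample above, with the result on the first non-empty line after "signalQuality:". The function names parse_signal_quality and run_diagnosis are hypothetical, and the real output may contain additional detail lines that this sketch ignores.

```python
import subprocess
from typing import Optional

# Sample output copied from the example above (prompt line omitted).
SAMPLE_OUTPUT = """\
Summary:
    Arch: aarch64
    Mode: ******
    Time: 20250529-19:24:32

Hardware:
    signalQuality:
        PASS
"""


def parse_signal_quality(output: str) -> Optional[str]:
    """Return the first non-empty line after 'signalQuality:', or None if absent."""
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "signalQuality:":
            for following in lines[index + 1:]:
                if following.strip():
                    return following.strip()
    return None


def run_diagnosis() -> Optional[str]:
    """Run the documented command and parse its output (needs ascend-dmi on PATH)."""
    completed = subprocess.run(
        ["ascend-dmi", "-dg", "-i", "signalQuality", "--link-type", "hccs,pcie"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_signal_quality(completed.stdout)


print(parse_signal_quality(SAMPLE_OUTPUT))  # prints: PASS
```

Only parse_signal_quality runs in this sketch. run_diagnosis is defined but not called, because it requires a host with ascend-dmi installed.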

Fault Check Items

Table 3 Fault check items

| Command Output | Description |
| --- | --- |
| PASS | The check is passed, and the signal quality is normal. |
| SKIP | The eye diagram diagnosis is not supported, or the link type specified by --link-type is not supported. |
| IMPORTANT_WARN | Important warning. The signal quality of at least one PCIe, HCCS, or RoCE link is abnormal. Contact Huawei technical support. |
| FAIL | The eye diagram detection fails. |

Note:

In the signal quality diagnosis, if the values of SNR and HEH are 0, no RoCE or HCCS link is established between the specified devices.
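
The result values in Table 3 can be mapped to follow-up handling in a monitoring script. The Python sketch below is one possible arrangement. The names RESULT_DESCRIPTIONS and needs_attention are hypothetical, and treating FAIL as requiring attention alongside IMPORTANT_WARN is an assumption rather than documented behavior.

```python
# Descriptions paraphrased from Table 3.
RESULT_DESCRIPTIONS = {
    "PASS": "The check is passed, and the signal quality is normal.",
    "SKIP": "The eye diagram diagnosis or the specified link type is not supported.",
    "IMPORTANT_WARN": (
        "The signal quality of at least one PCIe, HCCS, or RoCE link is abnormal. "
        "Contact Huawei technical support."
    ),
    "FAIL": "The eye diagram detection fails.",
}


def needs_attention(status: str) -> bool:
    """Return True for results that an operator should follow up on."""
    if status not in RESULT_DESCRIPTIONS:
        raise ValueError(f"unknown diagnosis result: {status!r}")
    # Assumption: both IMPORTANT_WARN and FAIL warrant follow-up.
    return status in ("IMPORTANT_WARN", "FAIL")


for status in ("PASS", "IMPORTANT_WARN"):
    print(status, needs_attention(status), RESULT_DESCRIPTIONS[status])
```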