HCCL_INTRA_PCIE_ENABLE

Description

Specifies whether devices within a server communicate with each other over the PCIe link.

The default value of this environment variable is 1. It can be configured on its own or together with the environment variable HCCL_INTRA_ROCE_ENABLE. The following table lists the intra-server communication link used for each combination of the two variables.

Table 1 Configuration combinations supported by HCCL_INTRA_PCIE_ENABLE and HCCL_INTRA_ROCE_ENABLE

HCCL_INTRA_PCIE_ENABLE    HCCL_INTRA_ROCE_ENABLE    Intra-Server Communication Link
1                         Not configured            PCIe
1                         0                         PCIe
0                         1                         RoCE
Not configured            1                         RoCE
0                         0                         PCIe
Not configured            Not configured            PCIe

HCCL_INTRA_PCIE_ENABLE and HCCL_INTRA_ROCE_ENABLE cannot be set to 1 at the same time.
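
The combinations in Table 1 can be summarized as: RoCE is used only when HCCL_INTRA_ROCE_ENABLE is 1, and PCIe is used in every other supported case. The following is a minimal sketch of a pre-launch check built on that reading. The function name hccl_intra_link is hypothetical and not part of HCCL, and the sketch assumes that each variable is either unset or set to 0 or 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HCCL): print the intra-server link that
# Table 1 implies for the current environment.
hccl_intra_link() {
    # Treat an unset or empty variable as "Not configured".
    local pcie="${HCCL_INTRA_PCIE_ENABLE:-unset}"
    local roce="${HCCL_INTRA_ROCE_ENABLE:-unset}"

    # Setting both variables to 1 is not a supported combination.
    if [ "$pcie" = "1" ] && [ "$roce" = "1" ]; then
        echo "invalid"
        return 1
    fi

    # In Table 1, RoCE appears only in rows where HCCL_INTRA_ROCE_ENABLE is 1.
    if [ "$roce" = "1" ]; then
        echo "RoCE"
    else
        echo "PCIe"
    fi
}
```

Calling hccl_intra_link in a launch script before starting the job makes an unsupported combination visible early, because the function returns a non-zero status when both variables are 1.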

Example

export HCCL_INTRA_PCIE_ENABLE=1

Restrictions

The Atlas 200T A2 Box16 heterogeneous subrack contains two modules, one on the left and one on the right: devices 0 to 7 and devices 8 to 15. For this product:

In the single-server scenario, when the server uses the PCIe link for internal communication and devices from both modules are used at the same time, both modules must contribute the same number of devices, and those devices must be on the same plane. That is, devices 0 and 8, 1 and 9, and so on must be used together. When the server uses the RoCE link for internal communication, this restriction does not apply.
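
The pairing rule for the PCIe link can be checked before a job is launched. The sketch below is one possible reading of the restriction, not an official tool: the function name check_pcie_device_pairs is hypothetical, and it assumes that device n in the first module (0 to 7) is on the same plane as device n + 8 in the second module, as the pairs 0 and 8, 1 and 9 suggest.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HCCL): check a list of device IDs against
# the PCIe pairing rule. Returns 0 if the list is acceptable, 1 if the pairing
# rule is violated, and 2 if a device ID is outside 0 to 15.
# Usage: check_pcie_device_pairs 0 1 8 9
check_pcie_device_pairs() {
    local left="" right="" d
    for d in "$@"; do
        if [ "$d" -ge 0 ] && [ "$d" -le 7 ]; then
            left="$left $d"              # plane index in the first module
        elif [ "$d" -ge 8 ] && [ "$d" -le 15 ]; then
            right="$right $((d - 8))"    # plane index in the second module
        else
            echo "device ID out of range: $d" >&2
            return 2
        fi
    done

    # Devices from a single module only: the pairing rule does not apply.
    if [ -z "$left" ] || [ -z "$right" ]; then
        return 0
    fi

    # Both modules are in use: their sorted plane indices must be identical.
    left="$(printf '%s\n' $left | sort -nu | tr '\n' ' ')"
    right="$(printf '%s\n' $right | sort -nu | tr '\n' ' ')"
    [ "$left" = "$right" ]
}
```

For example, the device list 0 1 8 9 passes the check, while 0 1 8 and 0 9 do not. The check is only relevant when the PCIe link is in use; with the RoCE link it can be skipped.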

Applicability

Atlas training products: Only the Atlas 300T Pro training card is supported.

Atlas A2 training products/Atlas A2 inference products: Only the Atlas 200T A2 Box16 heterogeneous subrack is supported.