HCCL_RDMA_RETRY_CNT

Description

Sets the number of retransmission attempts made by the RDMA NIC. The value must be an integer from 1 to 7. The default value is 7.

Example

# Set the retransmission count to 5.
export HCCL_RDMA_RETRY_CNT=5
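
Because the value must be an integer from 1 to 7, a launch script may want to check it before exporting. The following is a minimal sketch, not part of HCCL: the function name set_hccl_rdma_retry_cnt is hypothetical, and it assumes a POSIX-compatible shell.

```shell
# Hypothetical helper (not provided by HCCL): export HCCL_RDMA_RETRY_CNT
# only when the argument is a single digit in the documented range 1 to 7.
set_hccl_rdma_retry_cnt() {
    case "$1" in
        [1-7])
            export HCCL_RDMA_RETRY_CNT="$1"
            ;;
        *)
            # Reject empty, non-numeric, and out-of-range values.
            echo "HCCL_RDMA_RETRY_CNT must be an integer from 1 to 7, got: '$1'" >&2
            return 1
            ;;
    esac
}

# Set the retransmission count to 5.
set_hccl_rdma_retry_cnt 5
```

If the argument is rejected, the function returns a non-zero status and leaves any previously exported value unchanged, so a caller can stop the launch or fall back to the default of 7.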

Restrictions

None

Applicability

Atlas A3 training products/Atlas A3 inference products

Atlas A2 training products/Atlas A2 inference products (Only the Atlas 800T A2 training server, Atlas 900 A2 PoD cluster basic unit, and Atlas 200T A2 Box16 heterogeneous subrack are supported.)

Atlas training products

Atlas inference products (Only the Atlas 300I Duo inference card is supported.)