HCCL_MULTI_QP_THRESHOLD
Description
Sets the minimum amount of data that each QP (queue pair) carries when ranks communicate over RDMA using multiple QPs.
The value must be an integer from 1 to 8192, in KB. The default value is 512.
- If (data size of a single communication between ranks / configured value of HCCL_RDMA_QPS_PER_CONNECTION) is less than the configured value of HCCL_MULTI_QP_THRESHOLD, HCCL automatically reduces the number of QPs during execution so that the data size carried by each QP is greater than or equal to the value of HCCL_MULTI_QP_THRESHOLD. For example:
If the data size of a single communication between ranks is 1 MB, HCCL_RDMA_QPS_PER_CONNECTION is set to 4, and HCCL_MULTI_QP_THRESHOLD is set to 512 (each QP must carry at least 512 KB), four QPs would carry only 256 KB each. HCCL therefore reduces the number of QPs to 2, and only two QPs are used for data transmission between ranks.
- If the data size of a single communication between ranks is less than HCCL_MULTI_QP_THRESHOLD, single-QP data transmission is used.
- If each QP carries more than 512 KB and the HCCL Test tool is used to test RDMA traffic (inter-device traffic only, with the HCCS link not used), the delivery scheduling overhead in the multi-QP scenario is less than 3% worse than in the single-QP scenario.
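The QP reduction rules above can be sketched as a small shell function. This is only an illustration of the documented behavior, not HCCL's implementation: the function name effective_qps is hypothetical, all sizes are in KB, and the use of integer (floor) division is an assumption, because the exact rounding that HCCL applies is not stated here.

```shell
# Hypothetical helper (not part of HCCL): estimate how many QPs carry one transfer.
# Arguments: data size in KB, HCCL_RDMA_QPS_PER_CONNECTION, HCCL_MULTI_QP_THRESHOLD in KB.
effective_qps() {
    data_kb=$1
    qps=$2
    threshold_kb=$3
    # Largest QP count for which each QP still carries at least threshold_kb.
    # Assumption: floor division; HCCL's actual rounding is not documented here.
    max_qps=$((data_kb / threshold_kb))
    # Data smaller than the threshold falls back to a single QP.
    if [ "$max_qps" -lt 1 ]; then
        max_qps=1
    fi
    # Never use more QPs than were configured.
    if [ "$max_qps" -lt "$qps" ]; then
        echo "$max_qps"
    else
        echo "$qps"
    fi
}

effective_qps 1024 4 512   # 1 MB, 4 QPs configured: prints 2
effective_qps 256 4 512    # below the threshold: prints 1
effective_qps 4096 4 512   # 1024 KB per QP: prints 4
```

The first call reproduces the 1 MB example above; the second corresponds to the single-QP case.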
You can use the environment variable HCCL_RDMA_QPS_PER_CONNECTION or HCCL_RDMA_QP_PORT_CONFIG_PATH to enable multi-QP communication.
Example
export HCCL_MULTI_QP_THRESHOLD=512
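Because the threshold is only relevant when multi-QP communication is enabled, a launch script might set both variables together and check the documented range first. The following is a minimal sketch: the helper is_valid_threshold and the variable threshold_kb are hypothetical names, and the values 4 and 1024 are illustrative, not recommendations.

```shell
# Hypothetical helper: accept only integers in the documented range 1 to 8192 (KB).
is_valid_threshold() {
    case "$1" in
        # Reject empty strings and anything containing a non-digit character.
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 8192 ]
}

threshold_kb=1024   # illustrative value, in KB

if is_valid_threshold "$threshold_kb"; then
    # Enable multi-QP communication (4 QPs per connection is an example value).
    export HCCL_RDMA_QPS_PER_CONNECTION=4
    # Each QP must then carry at least threshold_kb of data.
    export HCCL_MULTI_QP_THRESHOLD="$threshold_kb"
else
    echo "HCCL_MULTI_QP_THRESHOLD must be an integer from 1 to 8192 (KB)" >&2
fi
```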
Restrictions
This environment variable takes effect only in single-operator calling mode. Static graph mode is not supported.