HCCL_RDMA_QP_PORT_CONFIG_PATH
Description
By default, one queue pair (QP) is created for data transfer during RDMA communication between two ranks. To use multiple QPs between two ranks and to specify the source port number that each QP uses, set this environment variable.
The variable specifies the path where the configuration file that maps each <srcIP,dstIP> pair to its source ports is stored. When multiple port numbers are configured for a <srcIP,dstIP> pair, the system enables multi-QP communication for that pair, and each configured port number is used as the source port of one QP.
Environment variable configuration (example):
export HCCL_RDMA_QP_PORT_CONFIG_PATH=/home/tmp
/home/tmp indicates the path for storing the configuration file MultiQpSrcPort.cfg of the mapping between <srcIP,dstIP> and the ports. The path can be an absolute path or a relative path, with a maximum of 4096 characters.
You must create the MultiQpSrcPort.cfg file yourself, and the file name must be exactly MultiQpSrcPort.cfg. Each line of the file uses the following format:
srcIP1,dstIP1=srcPort0,srcPort1,...,srcPortN
srcIPN,dstIPN=srcPort0,srcPort1,...,srcPortN
- The maximum number of lines that can be configured in the file is 131072 (128 × 1024).
- Each <srcIP,dstIP> address pair supports a maximum of 32 ports, but 8 or fewer ports per address pair is recommended. With more than 8 QPs, a performance gain is not guaranteed and the service may fail to run because of excessive memory usage.
- Each <srcIP,dstIP> address pair can appear only once in the file.
- srcIP and dstIP must be in IPv4 format rather than IPv6 format.
- srcIP and dstIP can be set to 0.0.0.0, indicating all IP addresses.
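The limits above can be checked before a job is launched. The following is a minimal Python sketch of such a check, not part of HCCL: the function name validate_multi_qp_cfg is hypothetical, and the handling of blank lines, surrounding whitespace, and the 0 to 65535 port range are assumptions that the rules above do not state.

```python
import ipaddress

MAX_LINES = 131072       # 128 * 1024, limit stated above
MAX_PORTS = 32           # hard limit per <srcIP,dstIP> pair
RECOMMENDED_PORTS = 8    # recommended upper bound per pair


def validate_multi_qp_cfg(text):
    """Return (entries, warnings); raise ValueError on a rule violation."""
    entries = {}
    warnings = []
    # Assumption: blank lines and surrounding whitespace are ignored.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) > MAX_LINES:
        raise ValueError(f"too many lines: {len(lines)} > {MAX_LINES}")
    for entry_no, line in enumerate(lines, start=1):
        pair, sep, ports_part = line.partition("=")
        if not sep:
            raise ValueError(f"entry {entry_no}: missing '='")
        ips = [p.strip() for p in pair.split(",")]
        if len(ips) != 2:
            raise ValueError(f"entry {entry_no}: expected srcIP,dstIP")
        try:
            for ip in ips:
                ipaddress.IPv4Address(ip)  # rejects IPv6 and malformed text
        except ipaddress.AddressValueError as exc:
            raise ValueError(f"entry {entry_no}: not an IPv4 address") from exc
        key = (ips[0], ips[1])
        if key in entries:
            raise ValueError(f"entry {entry_no}: duplicate pair {key}")
        try:
            ports = [int(p) for p in ports_part.split(",")]
        except ValueError as exc:
            raise ValueError(f"entry {entry_no}: non-numeric port") from exc
        # Assumption: ports are 16-bit values; the rules do not give a range.
        if any(p < 0 or p > 65535 for p in ports):
            raise ValueError(f"entry {entry_no}: port out of range")
        if len(ports) > MAX_PORTS:
            raise ValueError(f"entry {entry_no}: more than {MAX_PORTS} ports")
        if len(ports) > RECOMMENDED_PORTS:
            warnings.append(f"entry {entry_no}: {len(ports)} ports exceeds "
                            f"the recommended {RECOMMENDED_PORTS}")
        entries[key] = ports
    return entries, warnings
```

The sketch treats the 32-port limit as an error and the 8-port recommendation as a warning, mirroring the difference between the hard limit and the recommendation.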
The following is a configuration example of the MultiQpSrcPort.cfg file:
192.168.100.2,192.168.100.3=61100,61101,61102
192.168.100.4,192.168.100.5=61100,61101,61102,61104
0.0.0.0,192.168.100.122=65515,65516,65513
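A file like this can also be generated from a launcher script. The following Python sketch is illustrative only: the helper write_multi_qp_cfg and the temporary directory are hypothetical. It assumes the variable must be present in the environment of the training process before HCCL initializes, so it is set before any child process is started.

```python
import os
import tempfile

CFG_NAME = "MultiQpSrcPort.cfg"  # the file name is fixed


def write_multi_qp_cfg(directory, mapping):
    """Write one 'srcIP,dstIP=port,port,...' line per address pair."""
    lines = []
    for (src_ip, dst_ip), ports in mapping.items():
        port_list = ",".join(str(p) for p in ports)
        lines.append(f"{src_ip},{dst_ip}={port_list}")
    path = os.path.join(directory, CFG_NAME)
    with open(path, "w", encoding="ascii") as fh:
        fh.write("\n".join(lines) + "\n")
    return path


# Hypothetical location; any readable directory can be used.
cfg_dir = tempfile.mkdtemp()
cfg_path = write_multi_qp_cfg(cfg_dir, {
    ("192.168.100.2", "192.168.100.3"): [61100, 61101, 61102],
    ("0.0.0.0", "192.168.100.122"): [65515, 65516, 65513],
})

# The variable names the directory that holds the file, not the file itself.
os.environ["HCCL_RDMA_QP_PORT_CONFIG_PATH"] = cfg_dir
```

Processes launched afterwards from the same script (for example with subprocess) inherit the variable.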
Example
export HCCL_RDMA_QP_PORT_CONFIG_PATH=/home/tmp
Restrictions
- This environment variable supports only the single-operator calling mode and does not support the static graph mode.
- This environment variable takes priority over the environment variable HCCL_RDMA_QPS_PER_CONNECTION. When this environment variable is set, the number of QPs used for communication between two ranks is determined by the number of source ports configured in the MultiQpSrcPort.cfg file.
- The QP configuration priority is as follows:
Multi-QP configuration on the management plane (configured using the -s multi_qp parameter of hccn_tool) > QP configuration of the NSLB (configured using the -t nslb-dp parameter of hccn_tool) > Environment variable HCCL_RDMA_QP_PORT_CONFIG_PATH > Environment variable HCCL_RDMA_QPS_PER_CONNECTION
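The priority order can be read as a first-match lookup. The following Python sketch only models that ordering and does not reflect how HCCL implements it: the function pick_qp_source and its parameters are hypothetical, and representing each configuration source as an optional QP count is an assumption made for illustration.

```python
def pick_qp_source(mgmt_multi_qp=None, nslb_qp=None,
                   port_cfg_ports=None, qps_per_connection=None):
    """Return (source, qp_count) for the highest-priority source that is set."""
    candidates = [
        ("hccn_tool -s multi_qp", mgmt_multi_qp),
        ("hccn_tool -t nslb-dp", nslb_qp),
        # With the port configuration file, the QP count follows the number
        # of source ports configured for the address pair.
        ("HCCL_RDMA_QP_PORT_CONFIG_PATH",
         len(port_cfg_ports) if port_cfg_ports else None),
        ("HCCL_RDMA_QPS_PER_CONNECTION", qps_per_connection),
    ]
    for source, count in candidates:
        if count is not None:
            return source, count
    return "default", 1  # one QP per rank pair when nothing is configured


# Both environment variables set: the port configuration file wins.
source, count = pick_qp_source(port_cfg_ports=[61100, 61101, 61102],
                               qps_per_connection=4)
```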