HCCL_RDMA_PCIE_DIRECT_POST_NOSTRICT
Description
Specifies whether RDMA tasks are submitted in PCIe Direct mode. It is intended for multi-server communication scenarios where the host OS uses a memory page size other than 4 KB and the delivery of communication operators is host-bound. In such scenarios, enabling PCIe Direct mode helps improve communication operator delivery performance.
- TRUE: The RDMA task is submitted in PCIe Direct mode (high-speed communication interface between the host and device).
- FALSE (default): The RDMA task is submitted in host device communication (HDC) mode.
This environment variable takes effect only when the small-page size on the host is not 4 KB. If the page size is 4 KB, RDMA tasks are submitted in PCIe Direct mode regardless of the value of this environment variable.
- When this environment variable is set to TRUE, extra huge page memory is occupied on the device: each communication link occupies an extra 1 MB of huge page memory.
- To improve the delivery performance of communication operators while limiting huge page memory usage on the device, use HCCL_ALGO to set the inter-server communication algorithm to ring, which controls the number of communication links.
export HCCL_ALGO="level0:NA;level1:ring"
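The memory cost described above can be estimated with simple arithmetic. The following is a minimal sketch that multiplies a link count by the 1 MB per-link figure stated above. LINK_COUNT and estimate_extra_hugepage_mb are hypothetical names used for illustration only. The actual number of communication links depends on the algorithm and cluster topology and must be determined for your own job.

```shell
#!/bin/sh
# Sketch: estimate the extra device huge page memory used when
# HCCL_RDMA_PCIE_DIRECT_POST_NOSTRICT is TRUE.
# Each communication link occupies an extra 1 MB (figure from the text above).
PER_LINK_MB=1

# Prints the estimated extra huge page memory in MB for a given link count.
estimate_extra_hugepage_mb() {
    echo $(( $1 * PER_LINK_MB ))
}

# LINK_COUNT is a hypothetical input, not a value reported by HCCL.
LINK_COUNT=${LINK_COUNT:-8}
echo "Estimated extra huge page memory: $(estimate_extra_hugepage_mb "$LINK_COUNT") MB"
```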
Example
export HCCL_RDMA_PCIE_DIRECT_POST_NOSTRICT=TRUE
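Because the variable only takes effect when the host page size is not 4 KB, a launch script may check the page size before setting it. The sketch below assumes that the value reported by `getconf PAGESIZE` corresponds to the host small-page size referred to in this section. The helper name page_size_is_non_4k is hypothetical.

```shell
#!/bin/sh
# Sketch: enable PCIe Direct submission only when the host page size is not 4 KB.
# Assumption: getconf PAGESIZE reflects the host small-page size meant by this page.

# Succeeds (exit status 0) when the given page size in bytes is not 4096.
page_size_is_non_4k() {
    [ "$1" -ne 4096 ]
}

# getconf PAGESIZE reports the base page size of the host in bytes.
host_page_size=$(getconf PAGESIZE)

if page_size_is_non_4k "$host_page_size"; then
    export HCCL_RDMA_PCIE_DIRECT_POST_NOSTRICT=TRUE
    echo "Host page size is ${host_page_size} bytes; HCCL_RDMA_PCIE_DIRECT_POST_NOSTRICT=TRUE"
else
    echo "Host page size is 4 KB; the variable has no effect and is left unset"
fi
```

For the export to persist in the shell that launches the training job, source this snippet (for example with `.`) rather than running it as a separate process.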
Restrictions
- This environment variable applies only to multi-server communication scenarios.
- This environment variable applies only when the small-page size managed by the host OS is not 4 KB.