EI0015 Ranktable_Detect_Failed
Symptom
Failed to collect cluster information of the communicator based on rootInfo detection. Reason: %s.
Solution
- Check whether every rank in the communicator has called the communicator creation interface.
- Check the connectivity between the host networks of all nodes and the server node.
- Check whether the HCCL_SOCKET_IFNAME environment variable of all nodes is correctly configured.
- Increase the connection timeout by setting the HCCL_CONNECT_TIMEOUT environment variable.
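The connectivity and environment checks above can be partly scripted. The following is a minimal Python sketch, not an HCCL tool: the helper names (`can_connect`, `report_hccl_env`, `local_interface_names`) are hypothetical, and the host and port to probe are assumptions that you must replace with the values used in your own cluster. It only tests plain TCP reachability and prints the two environment variables named above so they can be compared across nodes by hand.

```python
import os
import socket


def can_connect(host, port, timeout_s=3.0):
    """Return True if a plain TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def report_hccl_env(environ=None):
    """Return the current values of the two variables named in this entry.

    A value of None means the variable is not set in the given environment.
    """
    environ = os.environ if environ is None else environ
    names = ("HCCL_SOCKET_IFNAME", "HCCL_CONNECT_TIMEOUT")
    return {name: environ.get(name) for name in names}


def local_interface_names():
    """List network interface names on this host for manual comparison
    with the value of HCCL_SOCKET_IFNAME."""
    return [name for _index, name in socket.if_nameindex()]


if __name__ == "__main__":
    # Hypothetical server node address and port; replace with real values.
    server_host, server_port = "192.0.2.10", 60000
    print("env:", report_hccl_env())
    print("interfaces:", local_interface_names())
    print("server reachable:", can_connect(server_host, server_port))
```

Running the script on each node and comparing the output gives a quick first pass. A successful TCP probe does not by itself prove that HCCL can establish its own connections, so treat a failure as a strong hint and a success as inconclusive.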