HCCL_RDMA_SL Configuration Error
Symptom
The printed log contains the keyword "EI0001" or "Environment variable [***] is invalid.", for example:
[PID:3729526]2025-10-23-17:30:40.098.984Invalid_Environment_Variable_Configuration(EI0001): Environment variable [HCCL_RDMA_SL] is invalid. Reason: Value range[0, 7].
Possible Cause: The environment variable configuration is invalid.
Solution: Try again with valid environment variable configuration.
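Alternatively, the ERROR-level CANN log contains the keyword "externalinput.cc", which indicates that the error occurred while reading the environment variable configuration, for example: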
[ERROR]HCCL(3729526,python3.11):2025-10-23-17:30:40.098.973 [externalinput.cc:963] [3729526][Parse][rdmaServerLevel]HCCL_RDMA_SL[1000] is invalid. except: [0, 7]
[ERROR]HCCL(3729526,python3.11):2025-10-23-17:30:40.099.058 [externalinput.cc:169] [3729526][Init][EnvVarParam]errNo[0x0000000005000001] In init env variable param, parse HCCL_RDMA_SL failed. errno[1]
[ERROR]HCCL(3729526,python3.11):2025-10-23-17:30:40.099.063 [externalinput.cc:47] [3729526][InitExternalInput]call trace: hcclRet -> 1
[ERROR]HCCL(3729526,python3.11):2025-10-23-17:30:40.099.068 [op_base.cc:866] [3729526][HcclGetRootInfo]call trace: hcclRet -> 1
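To locate this symptom in a large log file, the EI0001 message can be matched with a regular expression. The following is a minimal sketch, assuming the message format shown in the example above; the helper name find_invalid_env_vars is hypothetical and is not part of CANN or HCCL.

```python
import re

# Pattern based on the EI0001 message format shown in the example log above.
_EI0001 = re.compile(
    r"Environment variable \[([^\]]+)\] is invalid\.\s*Reason:\s*(.+)"
)


def find_invalid_env_vars(lines):
    """Return (variable_name, reason) pairs for every EI0001 line found.

    Hypothetical helper for illustration only.
    """
    hits = []
    for line in lines:
        if "EI0001" not in line:
            continue  # skip lines that do not carry the error code
        match = _EI0001.search(line)
        if match:
            hits.append((match.group(1), match.group(2).strip()))
    return hits
```

It can be applied to any iterable of log lines, for example an open file object.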
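Possible Causes and Solutions
The value configured for the environment variable does not meet the requirements. Adjust the value to the range suggested in the log (in the example above, HCCL_RDMA_SL was set to 1000, while the valid range is [0, 7]). If you still have questions, refer to the reference documentation for the corresponding environment variable.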
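The range reported in the log can also be checked before the job is launched. The following is a minimal sketch, assuming only the [0, 7] range shown in the log above; the helper name check_hccl_rdma_sl is hypothetical, and HCCL's own parsing rules may differ in detail (for example in how whitespace is handled).

```python
import os

# Valid range taken from the log message above: "Value range[0, 7]".
HCCL_RDMA_SL_MIN = 0
HCCL_RDMA_SL_MAX = 7


def check_hccl_rdma_sl(env=None):
    """Return (ok, message) for the HCCL_RDMA_SL setting in env.

    Hypothetical pre-launch check for illustration; not part of HCCL.
    """
    env = os.environ if env is None else env
    raw = env.get("HCCL_RDMA_SL")
    if raw is None:
        return True, "HCCL_RDMA_SL is not set; nothing to validate"
    try:
        value = int(raw.strip())
    except ValueError:
        return False, f"HCCL_RDMA_SL={raw!r} is not an integer"
    if not HCCL_RDMA_SL_MIN <= value <= HCCL_RDMA_SL_MAX:
        return False, (
            f"HCCL_RDMA_SL={value} is outside "
            f"[{HCCL_RDMA_SL_MIN}, {HCCL_RDMA_SL_MAX}]"
        )
    return True, f"HCCL_RDMA_SL={value} is within the valid range"


if __name__ == "__main__":
    ok, message = check_hccl_rdma_sl()
    print(message)
    raise SystemExit(0 if ok else 1)
```

Run as a script before starting training, it exits with a non-zero status when the value would be rejected, so a launch script can stop early.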