Operator Call Error

Symptom

Keyword: "Cannot find bin of op ..."

Traceback (most recent call last):
  File "/home/HwHiAiUser/workspace/qwen2.5-Math-deepseek-R1.py", line 38, in <module>
    generated_ids = model.generate(
  File "/home/HwHiAiUser/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/HwHiAiUser/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1575, in generate
    result = self._sample(
  File "/home/HwHiAiUser/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2690, in _sample
    model_kwargs["cache_position"] = torch.arange(cur_len, device=input_ids.device)
RuntimeError: call aclnnArange failed, detail:EZ9999: Inner Error!
EZ9999: [PID: 775] 2025-02-16-12:18:25.258.321 Cannot find bin of op Range, integral key 0/1/|int64/ND/int64/ND/int64/ND/int64/ND/.
        TraceBack (most recent call last):
       Cannot find binary for op Range.
       Kernel GetWorkspace failed. opType: 9
       ArangeAiCore ADD_TO_LAUNCHER_LIST_AICORE failed.
[ERROR] 2025-02-16-12:18:25 (PID:775, Device:0, RankID:-1) ERR01100 OPS call acl api failed

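Root Cause

Key process: an operator call fails.

Root cause analysis: the binary file for the called operator cannot be found. In the log above, the lookup fails for op Range (invoked through aclnnArange), and the reported key lists int64 for every tensor.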
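
The operator name and the data types that failed the lookup can be read straight from the log line. The following is a minimal sketch that extracts them; it assumes the "integral key" is a prefix followed by "|" and then dtype/format pairs, which is inferred only from the single sample above and is not a documented format. The function name parse_missing_bin is hypothetical.

```python
import re

# Pattern for the "Cannot find bin of op" line; layout inferred from one sample log.
_PATTERN = re.compile(
    r"Cannot find bin of op (?P<op>\w+), integral key (?P<key>\S+)"
)


def parse_missing_bin(log_text):
    """Return (op_name, [(dtype, format), ...]), or None if the line is absent."""
    match = _PATTERN.search(log_text)
    if match is None:
        return None
    # Assumption: everything after "|" is a sequence of dtype/format pairs.
    _, _, tensor_part = match.group("key").partition("|")
    tokens = [t for t in tensor_part.split("/") if t and t != "."]
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    return match.group("op"), pairs
```

Passing the EZ9999 line from the traceback to parse_missing_bin gives the operator name (Range) and the dtype/format pairs to look up in the operator specification.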

Solution

Error Code

ERR01100

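Fault Event Name

Operator call error

Fault Explanation / Possible Causes

  1. The matching Kernels package is not installed.
  2. No operator binary file matches the call.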
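
To help tell the two causes apart, one option is to search the installed operator package directory for files that mention the operator. The sketch below rests on two assumptions: that the caller knows the operator package root (on many CANN installations the ASCEND_OPP_PATH environment variable points to it, but verify this on your system), and that kernel binary file names contain the operator name. The helper find_kernel_bins is hypothetical and not part of CANN.

```python
from pathlib import Path


def find_kernel_bins(opp_root, op_name):
    """List files under opp_root whose name contains op_name (case-insensitive).

    Assumption: kernel binaries carry the operator name in their file name.
    """
    root = Path(opp_root)
    if not root.is_dir():
        # A missing root suggests the package is not installed at this location.
        return []
    needle = op_name.lower()
    return sorted(
        p for p in root.rglob("*") if p.is_file() and needle in p.name.lower()
    )
```

An empty result for every operator would point toward cause 1 (package missing or installed elsewhere); hits for the operator but still a failing call would point toward cause 2 (no binary for the requested data types).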

Fault Impact

The operator call fails and the process exits.

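Self-Handling Mode

  1. Check that the matching Kernels package is installed, and install it if it is missing.
  2. Check whether the operator supports the data type of its inputs. If it does not, use a supported data type. See the "CANN Operator Specifications" section of the CANN Operator Acceleration Library documentation.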
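
For step 2, in code you control, one way to switch to a supported data type is to retry the call with alternative dtypes when this specific error appears. The following is a sketch only: call_with_dtype_fallback is a hypothetical helper, and which dtypes actually have a binary must be confirmed in the operator specification.

```python
def call_with_dtype_fallback(op, dtypes):
    """Call op(dtype) for each dtype in order; return (dtype, result) on first success."""
    last_error = None
    for dtype in dtypes:
        try:
            return dtype, op(dtype)
        except RuntimeError as err:
            # Only swallow the "missing binary" failure; re-raise anything else.
            if "Cannot find bin of op" not in str(err):
                raise
            last_error = err
    raise RuntimeError("no candidate dtype has a kernel binary") from last_error


# Hypothetical usage with PyTorch (assumes int32 is supported; check the spec):
# dtype, pos = call_with_dtype_fallback(
#     lambda dt: torch.arange(cur_len, dtype=dt, device=device),
#     [torch.int64, torch.int32],
# )
```

This does not help when the failing call sits inside a third-party library, as in the traceback above; in that case installing the matching Kernels package is the more likely fix.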

System Handling Suggestion

No action required.