Configuring In-Graph Operators to Never Time Out

Overview

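During network inference, an operator that fails or runs too long can leave a task waiting for an extended period, which stalls or blocks the whole workload. To improve reliability and recoverability, a timeout threshold can be set for operator execution: once an operator runs longer than the threshold, a recovery flow is triggered so that the service recovers quickly. Because network models and operator characteristics differ widely, however, a single uniform threshold is hard to choose: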

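Some operators are computationally heavy and legitimately take a long time to run.
A threshold that is too short triggers the recovery flow by mistake.
A threshold that is too long delays the response to real faults.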

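To balance stability and flexibility, some operators therefore need a "never timeout" tag that excludes them from timeout detection. This avoids false positives and keeps the inference flow running without interruption.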
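The trade-off described above can be pictured with a small stand-alone model. This is only an illustrative sketch, not how CANN or TorchAir implements timeout detection: the OpRun class, the needs_recovery function, and the 30-second threshold are all hypothetical, and the never_timeout field merely stands in for the _op_exec_never_timeout attribute.

```python
from dataclasses import dataclass


@dataclass
class OpRun:
    """One operator execution as seen by a hypothetical timeout watchdog."""
    name: str
    elapsed_s: float             # measured execution time in seconds
    never_timeout: bool = False  # stands in for _op_exec_never_timeout


def needs_recovery(run: OpRun, threshold_s: float) -> bool:
    """Return True when this run should trigger the recovery flow."""
    if run.never_timeout:
        # Exempt operators are skipped by timeout detection entirely.
        return False
    return run.elapsed_s > threshold_s


runs = [
    OpRun("Add", elapsed_s=0.02),
    OpRun("Mul", elapsed_s=45.0, never_timeout=True),  # slow by design, exempt
    OpRun("Sub", elapsed_s=45.0),                      # slow and not exempt
]
flagged = [r.name for r in runs if needs_recovery(r, threshold_s=30.0)]
print(flagged)
```

In this model the threshold can stay short enough to catch real hangs, because the one operator known to be slow is excluded rather than accommodated by raising the threshold for everything.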

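TorchAir provides the torchair.scope.op_never_timeout interface, which adds the _op_exec_never_timeout attribute to the operators within a specified scope so that they are not subject to timeout detection.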

Constraints

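This feature is supported only in max-autotune mode.
In operator fusion scenarios, if a constituent operator has this feature configured, the setting is not inherited by the new fused operator node.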

Usage

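Analyze the model yourself to identify the operators that can be exempted from timeout detection.
Configure the selected operators to never time out.
Use the following with block (op_never_timeout). The enable parameter is of type bool; when it is set to True, the _op_exec_never_timeout attribute is automatically added to the operators inside the block, and they do not take part in timeout detection.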

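For details and constraints on the _op_exec_never_timeout attribute, see the "数据类型>属性名列表" (Data Types > Attribute Name List) section of 《CANN GE图引擎接口》 (CANN GE graph engine interface reference).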
with torchair.scope.op_never_timeout(enable=True):
    ...  # operators created in this block skip timeout detection

Example

import torch
import torch_npu, torchair
import logging
from torchair import logger

logger.setLevel(logging.DEBUG)
config = torchair.CompilerConfig()
config.mode = "max-autotune"
npu_backend = torchair.get_npu_backend(compiler_config=config)

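class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x, y):
        x = x + 1  # outside any op_never_timeout scope
        with torchair.scope.op_never_timeout(enable=True):
            x = x * y  # Mul: _op_exec_never_timeout set to True
            with torchair.scope.op_never_timeout(enable=False):
                y = y - 1  # Sub: _op_exec_never_timeout set to False
        return x + y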

model = Model()
model = torch.compile(model, backend=npu_backend, dynamic=False)
x = torch.randn(2, 2)
y = torch.randn(2, 2)
model(x, y)

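In this example, "x=x*y" is computed by the Mul operator, which is marked as never-timeout, and "y=y-1" is computed by the Sub operator, which is not (the inner scope sets its attribute to False).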

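Once the setting takes effect, enable Debug logging as described in the TorchAir Python-layer logging documentation. Messages similar to the following appear: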

[DEBUG] TORCHAIR(993590,python):2025-10-30 17:42:58.380.700 [_scope_attr.py:38]993590 Set attribute _op_exec_never_timeout: True on op: Mul
[DEBUG] TORCHAIR(993590,python):2025-10-30 17:42:58.388.864 [_scope_attr.py:38]993590 Set attribute _op_exec_never_timeout: False on op: Sub
