Functional Verification

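  1. Build the custom operator package.

    Prepare the environment as described in "Environment Preparation", then run the following commands to rebuild and reinstall the torch_npu package that contains the custom operator torch.ops.npu.my_op. The Python version passed to the build script must match the Python version of the current runtime environment. The example below uses Python 3.8: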

    bash ci/build.sh --python=3.8
    pip3 install dist/torch*.whl --force-reinstall --no-deps
  2. Verify that the custom operator works correctly in Eager mode, TorchAir reduce-overhead mode, and TorchAir max-autotune mode.
    import torch
    import torch_npu
    import torchair
    
    # Eager mode: call the custom operator directly, without graph compilation.
    def test_eager(x, y, z, attr1, attr2):
        return torch.ops.npu.my_op(x, y, z, attr1, attr2)
    
    # TorchAir reduce-overhead mode, i.e. aclgraph mode.
    config = torchair.CompilerConfig()
    config.mode = "reduce-overhead"
    @torch.compile(backend=torchair.get_npu_backend(compiler_config=config))
    def test_torchair_reduce_overhead(x, y, z, attr1, attr2):
        return torch.ops.npu.my_op(x, y, z, attr1, attr2)
    
    # TorchAir max-autotune mode, i.e. Ascend IR mode.
    # A separate CompilerConfig is created for this mode.
    config = torchair.CompilerConfig()
    config.mode = "max-autotune"
    @torch.compile(backend=torchair.get_npu_backend(compiler_config=config))
    def test_torchair_max_autotune(x, y, z, attr1, attr2):
        return torch.ops.npu.my_op(x, y, z, attr1, attr2)
    
    x = torch.ones(4, 8).npu()
    y = None
    z = [torch.ones(4, 8).npu(), torch.ones(4, 8).npu()]
    attr1 = 2.0
    attr2 = 5
    
    test_eager(x, y, z, attr1, attr2)
    torch.npu.synchronize()
    print("Eager ok")
    test_torchair_reduce_overhead(x, y, z, attr1, attr2)
    torch.npu.synchronize()
    print("TorchAir-reduce-overhead ok")
    test_torchair_max_autotune(x, y, z, attr1, attr2)
    torch.npu.synchronize()
    print("TorchAir-max-autotune ok")
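
The script in step 2 stops at the first mode that raises, so a failure in Eager mode hides the status of the two TorchAir modes. The following is a minimal sketch of a small runner that tries every mode and records either the result or the exception. The helper names `run_modes` and `failed_modes` are hypothetical and are not part of torch_npu or TorchAir. The runner uses only the Python standard library and takes the synchronization function as a parameter, so it makes no assumption about the device API.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_modes(
    modes: Dict[str, Callable[..., Any]],
    args: Tuple[Any, ...],
    sync: Callable[[], None] = lambda: None,
) -> Dict[str, Any]:
    """Call every function in `modes` with `args`.

    Returns a dict mapping each mode name to the function's return value,
    or to the exception instance if the call (or the sync) raised.
    """
    results: Dict[str, Any] = {}
    for name, fn in modes.items():
        try:
            out = fn(*args)
            # Wait for the device so that asynchronous errors are
            # attributed to this mode rather than to a later one.
            sync()
            results[name] = out
        except Exception as exc:
            # Record the failure and continue with the remaining modes.
            results[name] = exc
    return results


def failed_modes(results: Dict[str, Any]) -> List[str]:
    """Return the names of the modes whose entry is an exception."""
    return [name for name, r in results.items() if isinstance(r, Exception)]
```

With the definitions from step 2 in scope, one possible call is `run_modes({"Eager": test_eager, "TorchAir-reduce-overhead": test_torchair_reduce_overhead, "TorchAir-max-autotune": test_torchair_max_autotune}, (x, y, z, attr1, attr2), sync=torch.npu.synchronize)`, followed by `failed_modes(...)` on the returned dict to list the modes that did not run cleanly. Comparing the returned values across modes is left out here because the output type of torch.ops.npu.my_op is not specified in this section.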