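How to Run CAModel with a Specified Kernel.o File to Generate a Pipeline Diagram

Problem Description

When an operator's performance does not meet business requirements, you can skip the compilation stage and run CAModel directly on a specified operator Kernel.o file to generate a pipeline diagram, which helps locate the performance problem quickly.

Possible Causes

Solution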

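  1. Obtain the log file.

    In both command-line and API mode, the log file location is set by the log-file option (see NPU debugging parameters) or by the set_log_file interface. The default is debug_op.log in the current working directory. Open the log file at its actual path.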

  2. Copy the PATH, LD_LIBRARY_PATH, and camodel run start content from the log file.
    Search the log for the keywords "update PATH path", "update LD_LIBRARY_PATH path", and "camodel run start", and take the most recent occurrence of each.
    [CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,484 ==================== npu kernel run end, takes 4438310.0(us) ====================
    [CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,485 compare output and golden data start
    [CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,486 Gen data compare result file: /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/npu/output/y.txt
    [CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,486 compare output and golden data end
    [CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,490 OpInfoConfig(json_file='', op_type='ForeachSigmoid', data_script='', gen_data=False, args=[TensorListDesc(tensors=[TensorDesc(name='x', dtype='float16', fmt='ND', shape=[1, 4], ori_fmt='ND', ori_shape=[1, 4], data_file='/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/x.bin', data_value=None, data_type='data_file', param_type='required', ignore=False, is_input=True)], is_input=True), TensorListDesc(tensors=[TensorDesc(name='y', dtype='float16', fmt='ND', shape=[1, 4], ori_fmt='ND', ori_shape=[1, 4], data_file='/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/y.bin', data_value=None, data_type='data_file', param_type='required', ignore=False, is_input=False)], is_input=False)], attrs=[], chk_dump_path='', kernel_info=None, test_data=TestData(input_files=['/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/x.bin'], golden_files=['/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/y.bin']))
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 launch_info: EntryKernelInfo(kernel_name='foreach_sigmoid', kernel_args=[KernelArgInfo(arg_name='x', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.Tensor: 2>, arg_orignal=True), KernelArgInfo(arg_name='y', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.Tensor: 2>, arg_orignal=True), KernelArgInfo(arg_name='workspace', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.WORKSPACE: 3>, arg_orignal=True), KernelArgInfo(arg_name='tiling', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.TILING: 4>, arg_orignal=True)], dump_size=0)
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 g++ -g -O0 -fPIC -shared -rdynamic /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/src/_gen_args_foreach_sigmoid.cpp -o /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/build/launch_args.so
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,632 update PATH path: /home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch:/home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch:/home/run_pkg/latest/toolkit/tools/ccec_compiler/bin:/home/run_pkg/latest/toolkit/tools/biprof/:/home/run_pkg/latest/toolkit/tools/ascendc_tools/:/home/run_pkg/latest/toolkit/tools/profiler/bin/:/home/run_pkg/latest/toolkit/python/site-packages/bin/:/home/run_pkg/latest/compiler/ccec_compiler/bin:/home/run_pkg/latest/compiler/bin:/root/bin:/root/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,633 update LD_LIBRARY_PATH path: /home/run_pkg/latest/toolkit/tools/simulator/${chip_version}/lib/:/home/run_pkg/latest/toolkit/tools/aml/lib64:/home/run_pkg/latest/toolkit/tools/aml/lib64/plugin:/home/run_pkg/latest/opp/lib64:/home/run_pkg/latest/compiler/lib64:/home/run_pkg/latest/compiler/lib64/plugin/opskernel:/home/run_pkg/latest/compiler/lib64/plugin/nnengine:/home/run_pkg/latest/runtime/lib64:/home/run_pkg/latest/compiler/lib64/stub
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,633 update LD_LIBRARY_PATH path: /home/run_pkg/latest/toolkit/tools/simulator/${chip_version}/lib/:/home/run_pkg/latest/toolkit/tools/aml/lib64:/home/run_pkg/latest/toolkit/tools/aml/lib64/plugin:/home/run_pkg/latest/opp/lib64:/home/run_pkg/latest/compiler/lib64:/home/run_pkg/latest/compiler/lib64/plugin/opskernel:/home/run_pkg/latest/compiler/lib64/plugin/nnengine:/home/run_pkg/latest/runtime/lib64:/home/run_pkg/latest/compiler/lib64/stub:/home/run_pkg/latest/runtime/lib64/stub
    [INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,657 npu_kernel_launch: /home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch/npu_kernel_launch
    [CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,657 ==================== camodel run start ====================
    [CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,657 /home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch/npu_kernel_launch --kernel /home/ascendebug_smoking_test/op_contrib/data/op-contrib/build_out/binary/${chip_version}/bin/foreach_sigmoid/ForeachSigmoid_0885a6586f8e7f8dc8d03c4dabc73ef4_high_performance.o --name ForeachSigmoid --json_file /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/ForeachSigmoid.json --input_path /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data --output_path /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/output --tiling_data /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/tiling/tiling_data_tiling_key_1_block_dim_1_workspace_33554432.bin --tiling_key 1 --workspace 33554432 --block_dim 1 --timeout 1200 --device 0 --core_type VectorCore --arg_lib /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/build/launch_args.so --camodel
    kernel name: ForeachSigmoid
    kernel file: /home/ascendebug_smoking_test/op_contrib/data/op-contrib/build_out/binary/${chip_version}/bin/foreach_sigmoid/ForeachSigmoid_0885a6586f8e7f8dc8d03c4dabc73ef4_high_performance.o
    ......
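  3. Modify the environment variables.
    1. In any terminal window, open the CANN environment variable file. The default path is ${INSTALL_DIR}/set_env.sh.
    2. Set the following environment variables to adjust the log print level:
      export CAMODEL_LOG_PATH=${path to the simulation log directory}    # Create this directory in advance to store the CAModel run logs
      export ASCEND_GLOBAL_LOG_LEVEL=3               # Set the log level to ERROR
      export ASCEND_SLOG_PRINT_TO_STDOUT=1           # Print logs to the screen; logs are not saved to log files
    3. Copy the content from the log in step 2 and use it to replace the original PATH and LD_LIBRARY_PATH variables.
      export PATH=${content following "update PATH path"}
      export LD_LIBRARY_PATH=${content following "update LD_LIBRARY_PATH path"}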
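  4. (Optional) Modify the Simulator startup configuration file.

    Perform this step for Atlas A2 training series products and Atlas 800I A2 inference products.

    The default path of the Simulator startup configuration file is ${INSTALL_DIR}/tools/simulator/${chip_version}/lib/pem_config_cloud.toml. Replace ${INSTALL_DIR} with the directory where the CANN software is installed. For example, if you installed the Ascend-cann-toolkit package, the installation directory is $HOME/Ascend/ascend-toolkit/latest. ${chip_version} is the version of the Ascend AI processor.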

    [LOG]
        disable_list            = [ "log" ]
        # Add mte_log (data-movement logs) and ccu_log (common instruction-stream logs, such as instruction issue/execution)
        enable_list             = [ "instr_log", "instr_popped_log", "icache_log", "dcache_log", "reg_log", "mte_log", "ccu_log" ]
        # trace: 0, debug: 1, info: 2, warn: 3, error: 4, critical: 5, off: 6
        file_print_level        = 1
        screen_print_level      = 3
        flush_level             = 1
        rotating_file_size      = 134217728     # 0x8000000     # 128 MiB
        rotating_file_number    = 2
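  5. Start the CAModel simulation.

    Copy the command that follows "camodel run start" in the step 2 log and run it directly.

    You can point the --kernel parameter at the operator Kernel.o file you want to simulate, and use the --timeout parameter to specify the CAModel simulation execution time. The simulation results are saved to the path set in the CAMODEL_LOG_PATH variable in step 3. For the other parameters, see Simulator simulation parameters.

  6. Generate the pipeline diagram for the operator Kernel.o file. For the detailed steps, see How to parse a specified simulation log and kernel.o file to generate a pipeline diagram.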
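
Steps 2, 3, and 5 involve copying three values out of debug_op.log by hand. The shell functions below are a minimal sketch of how that extraction could be scripted. The function names last_value, last_command, and with_kernel are hypothetical helpers and are not part of the CANN toolkit. The sketch assumes the log layout shown in step 2 (a timestamp followed by the message, with the launch command on the line right after the "camodel run start" banner) and paths that contain no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading debug_op.log; not shipped with CANN.

# Print the text after the LAST "<key>: " in the log, e.g. key "update PATH path".
# Taking the last match corresponds to "the most recent occurrence" in step 2.
last_value() {
    awk -v k="$1: " '
        { i = index($0, k); if (i) v = substr($0, i + length(k)) }
        END { if (v != "") print v }
    ' "$2"
}

# Print the launch command that follows the LAST "camodel run start" banner,
# with the "[LEVEL] tool [pid] date time" prefix stripped.
last_command() {
    awk '
        /camodel run start/ { grab = 1; next }
        grab {
            sub(/^.*[0-9][0-9]:[0-9][0-9]:[0-9][0-9],[0-9]+ /, "")
            cmd = $0
            grab = 0
        }
        END { if (cmd != "") print cmd }
    ' "$1"
}

# Replace the argument of --kernel in a command line ($1) with another
# Kernel.o path ($2). Assumes no argument contains whitespace.
with_kernel() {
    printf '%s\n' "$1" | awk -v k="$2" '
        { for (i = 1; i < NF; i++) if ($i == "--kernel") $(i + 1) = k; print }
    '
}

# Example usage (debug_op.log in the current directory; /path/to/Kernel.o
# is a placeholder for your own file):
#   export PATH="$(last_value 'update PATH path' debug_op.log)"
#   export LD_LIBRARY_PATH="$(last_value 'update LD_LIBRARY_PATH path' debug_op.log)"
#   cmd="$(with_kernel "$(last_command debug_op.log)" /path/to/Kernel.o)"
#   printf '%s\n' "$cmd"    # review the command, then run it
```

Printing the rebuilt command before running it keeps the manual check that step 5 implies. The script only reads the log and does not change how the tool itself behaves.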