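How to Run CAModel with a Specified Kernel.o File to Generate a Pipeline Diagram
Problem Description
When operator performance does not meet actual service requirements, you can skip the compilation stage and run CAModel directly on a specified operator Kernel.o file to generate a pipeline diagram, which helps locate operator performance problems quickly.
Possible Causes
None
Solution
- Obtain the log file.
Whether you use the command line or the API, the log file location is specified by log-file (see NPU debugging parameters > log-file) or by the set_log_file interface. The default is debug_op.log in the current working directory. Open the log file at its actual path.
- Copy the log content related to PATH, LD_LIBRARY_PATH, and camodel run start. Search the log for the keywords "update PATH path", "update LD_LIBRARY_PATH path", and "camodel run start", and take the most recent occurrence of each.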
[CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,484 ==================== npu kernel run end, takes 4438310.0(us) ====================
[CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,485 compare output and golden data start
[CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,486 Gen data compare result file: /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/npu/output/y.txt
[CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,486 compare output and golden data end
[CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,490 OpInfoConfig(json_file='', op_type='ForeachSigmoid', data_script='', gen_data=False, args=[TensorListDesc(tensors=[TensorDesc(name='x', dtype='float16', fmt='ND', shape=[1, 4], ori_fmt='ND', ori_shape=[1, 4], data_file='/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/x.bin', data_value=None, data_type='data_file', param_type='required', ignore=False, is_input=True)], is_input=True), TensorListDesc(tensors=[TensorDesc(name='y', dtype='float16', fmt='ND', shape=[1, 4], ori_fmt='ND', ori_shape=[1, 4], data_file='/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/y.bin', data_value=None, data_type='data_file', param_type='required', ignore=False, is_input=False)], is_input=False)], attrs=[], chk_dump_path='', kernel_info=None, test_data=TestData(input_files=['/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/x.bin'], golden_files=['/home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/y.bin']))
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 launch_info: EntryKernelInfo(kernel_name='foreach_sigmoid', kernel_args=[KernelArgInfo(arg_name='x', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.Tensor: 2>, arg_orignal=True), KernelArgInfo(arg_name='y', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.Tensor: 2>, arg_orignal=True), KernelArgInfo(arg_name='workspace', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.WORKSPACE: 3>, arg_orignal=True), KernelArgInfo(arg_name='tiling', arg_type='__gm__ uint8_t*', arg_cls=<KernelArgType.TILING: 4>, arg_orignal=True)], dump_size=0)
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 --------arg_info.arg_type: __gm__ uint8_t*
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,492 g++ -g -O0 -fPIC -shared -rdynamic /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/src/_gen_args_foreach_sigmoid.cpp -o /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/build/launch_args.so
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,632 update PATH path: /home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch:/home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch:/home/run_pkg/latest/toolkit/tools/ccec_compiler/bin:/home/run_pkg/latest/toolkit/tools/biprof/:/home/run_pkg/latest/toolkit/tools/ascendc_tools/:/home/run_pkg/latest/toolkit/tools/profiler/bin/:/home/run_pkg/latest/toolkit/python/site-packages/bin/:/home/run_pkg/latest/compiler/ccec_compiler/bin:/home/run_pkg/latest/compiler/bin:/root/bin:/root/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,633 update LD_LIBRARY_PATH path: /home/run_pkg/latest/toolkit/tools/simulator/${chip_version}/lib/:/home/run_pkg/latest/toolkit/tools/aml/lib64:/home/run_pkg/latest/toolkit/tools/aml/lib64/plugin:/home/run_pkg/latest/opp/lib64:/home/run_pkg/latest/compiler/lib64:/home/run_pkg/latest/compiler/lib64/plugin/opskernel:/home/run_pkg/latest/compiler/lib64/plugin/nnengine:/home/run_pkg/latest/runtime/lib64:/home/run_pkg/latest/compiler/lib64/stub
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,633 update LD_LIBRARY_PATH path: /home/run_pkg/latest/toolkit/tools/simulator/${chip_version}/lib/:/home/run_pkg/latest/toolkit/tools/aml/lib64:/home/run_pkg/latest/toolkit/tools/aml/lib64/plugin:/home/run_pkg/latest/opp/lib64:/home/run_pkg/latest/compiler/lib64:/home/run_pkg/latest/compiler/lib64/plugin/opskernel:/home/run_pkg/latest/compiler/lib64/plugin/nnengine:/home/run_pkg/latest/runtime/lib64:/home/run_pkg/latest/compiler/lib64/stub:/home/run_pkg/latest/runtime/lib64/stub
[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,657 npu_kernel_launch: /home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch/npu_kernel_launch
[CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,657 ==================== camodel run start ====================
[CONSOLE] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,657 /home/run_pkg/latest/toolkit/tools/ascendc_tools/npu_kernel_launch/npu_kernel_launch --kernel /home/ascendebug_smoking_test/op_contrib/data/op-contrib/build_out/binary/${chip_version}/bin/foreach_sigmoid/ForeachSigmoid_0885a6586f8e7f8dc8d03c4dabc73ef4_high_performance.o --name ForeachSigmoid --json_file /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data/ForeachSigmoid.json --input_path /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/data --output_path /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/output --tiling_data /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/tiling/tiling_data_tiling_key_1_block_dim_1_workspace_33554432.bin --tiling_key 1 --workspace 33554432 --block_dim 1 --timeout 1200 --device 0 --core_type VectorCore --arg_lib /home/ascendebug_smoking_test/op_contrib/api_opcontrib_case/ForeachSigmoid/simulator/build/launch_args.so --camodel
kernel name: ForeachSigmoid
kernel file: /home/ascendebug_smoking_test/op_contrib/data/op-contrib/build_out/binary/${chip_version}/bin/foreach_sigmoid/ForeachSigmoid_0885a6586f8e7f8dc8d03c4dabc73ef4_high_performance.o
......
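Searching a long log by hand is error-prone, so the lookup in this step can be scripted. The following is a minimal sketch, not part of the tool itself. It assumes that each log entry occupies one line, that entries carry the "[LEVEL] ascendc_debug_tool [pid] date time" prefix seen in the excerpt, and that the launch command is the entry immediately after the "camodel run start" banner. The function name extract_latest is hypothetical.

```python
import re

# Assumed entry prefix, e.g. "[INFO] ascendc_debug_tool [3626213] 2024-05-21 19:15:40,632 "
PREFIX = re.compile(
    r"^\[\w+\]\s+\S+\s+\[\d+\]\s+\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d+\s+"
)


def extract_latest(log_text):
    """Return the most recent PATH, LD_LIBRARY_PATH and CAModel command in the log."""
    result = {"PATH": None, "LD_LIBRARY_PATH": None, "command": None}
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        body = PREFIX.sub("", line).strip()
        # Later matches overwrite earlier ones, so the last occurrence wins.
        if body.startswith("update PATH path:"):
            result["PATH"] = body.split(":", 1)[1].strip()
        elif body.startswith("update LD_LIBRARY_PATH path:"):
            result["LD_LIBRARY_PATH"] = body.split(":", 1)[1].strip()
        elif "camodel run start" in body and i + 1 < len(lines):
            # Assumption: the command is logged on the next entry.
            result["command"] = PREFIX.sub("", lines[i + 1]).strip()
    return result
```

A typical use would be to read debug_op.log into a string, pass it to extract_latest, and review the three values before using them in the following steps.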
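- Modify the environment variables.
- In any terminal window, open the CANN environment variable file. The default path is ${INSTALL_DIR}/set_env.sh.
- Set the following environment variables to adjust the log level and log output: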
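export CAMODEL_LOG_PATH=${path to the simulation logs} # Create this folder in advance; it stores the CAModel run logs
export ASCEND_GLOBAL_LOG_LEVEL=3 # Set the log level to ERROR
export ASCEND_SLOG_PRINT_TO_STDOUT=1 # Print logs to the screen; logs are then not saved to log files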
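- Copy the values from the log content obtained in step 2 and use them to replace the original PATH and LD_LIBRARY_PATH variables.
export PATH=${content after "update PATH path"}
export LD_LIBRARY_PATH=${content after "update LD_LIBRARY_PATH path"}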
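If the CAModel command is launched from a script rather than from a shell that sourced set_env.sh, the same variables can be assembled programmatically. The sketch below is illustrative only. The helper name build_camodel_env and its parameters are hypothetical, and the values for PATH and LD_LIBRARY_PATH are assumed to be the strings copied from the log.

```python
import os


def build_camodel_env(base_env, camodel_log_dir, path_value, ld_library_path_value):
    """Return a copy of base_env with the variables from this step applied."""
    # The CAModel log folder has to exist before the simulation starts.
    os.makedirs(camodel_log_dir, exist_ok=True)
    env = dict(base_env)
    env["CAMODEL_LOG_PATH"] = camodel_log_dir
    env["ASCEND_GLOBAL_LOG_LEVEL"] = "3"      # ERROR
    env["ASCEND_SLOG_PRINT_TO_STDOUT"] = "1"  # print logs to the screen
    # Replace, rather than extend, the original search paths.
    env["PATH"] = path_value
    env["LD_LIBRARY_PATH"] = ld_library_path_value
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run, which leaves the environment of the calling process untouched.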
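- (Optional) Modify the Simulator startup configuration file.
This step is required only for Atlas A2 training series products and Atlas 800I A2 inference products.
The default path of the Simulator startup configuration file is ${INSTALL_DIR}/tools/simulator/${chip_version}/lib/pem_config_cloud.toml. Replace ${INSTALL_DIR} with the path where the CANN files are stored after installation. For example, if the Ascend-cann-toolkit package is installed, this path is $HOME/Ascend/ascend-toolkit/latest. ${chip_version} is the version of the Ascend AI processor.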
[LOG]
disable_list = [ "log" ]
# Add mte_log (data movement logs) and ccu_log (common instruction stream logs, such as instruction issue and execution)
enable_list = [ "instr_log", "instr_popped_log", "icache_log", "dcache_log", "reg_log", "mte_log", "ccu_log" ]
# trace: 0, debug: 1, info: 2, warn: 3, error: 4, critical: 5, off: 6
file_print_level = 1
screen_print_level = 3
flush_level = 1
rotating_file_size = 134217728 # 0x8000000 # ~130MB
rotating_file_number = 2
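Before starting a long simulation it can be worth confirming that the configuration edit took effect. The sketch below checks whether mte_log and ccu_log appear in enable_list. It uses a plain regular expression rather than a TOML parser, so it assumes that enable_list is written as a single bracketed array of double-quoted strings, as in the fragment for this step. The function name missing_log_types is hypothetical.

```python
import re


def missing_log_types(config_text, required=("mte_log", "ccu_log")):
    """Return the required log types that are absent from enable_list."""
    # Anchor at line start so commented-out lines beginning with "#" are ignored.
    match = re.search(
        r"^\s*enable_list\s*=\s*\[(.*?)\]", config_text, re.MULTILINE | re.DOTALL
    )
    if match is None:
        return list(required)
    enabled = set(re.findall(r'"([^"]+)"', match.group(1)))
    return [name for name in required if name not in enabled]
```

An empty return value means both log types are enabled. Any names returned still need to be added to enable_list by hand.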
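- Start the CAModel simulation.
Copy the command that follows "camodel run start" in the log content obtained in step 2 and run it directly.
The --kernel argument can be replaced with the path to your operator Kernel.o file, and the --timeout argument specifies the CAModel simulation execution time. The simulation results are saved to the path set in the CAMODEL_LOG_PATH variable in step 3. For other parameters, see Simulator simulation parameters.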
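Editing a command of this length by hand invites quoting mistakes, so the substitution of --kernel and --timeout can be done on the tokenized command instead. This is a minimal sketch using Python's standard shlex module. The helper name replace_option is hypothetical, and it assumes that each option is followed by exactly one value, as in the logged command.

```python
import shlex


def replace_option(command, option, value):
    """Return command with the value following option replaced by value."""
    tokens = shlex.split(command)
    if option not in tokens:
        raise ValueError(f"{option} not found in command")
    index = tokens.index(option)
    if index + 1 >= len(tokens):
        raise ValueError(f"{option} has no value in command")
    tokens[index + 1] = value
    # shlex.join re-quotes tokens that contain spaces or shell metacharacters.
    return shlex.join(tokens)
```

For example, calling replace_option(command, "--kernel", "/path/to/your/Kernel.o") on the copied command yields a new command string that differs only in the kernel file, where the path shown is a placeholder.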
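- Generate the pipeline diagram for the operator Kernel.o file. For the detailed steps, see "How to Parse Specified Simulation Logs and a kernel.o File to Generate a Pipeline Diagram".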