Usage

ait llm dump --exec "bash run.sh patched/models/modeling_xxx.py" [optional arguments]
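
When the dump is launched from a script, assembling the command as an argument list avoids quoting mistakes around the --exec value. The following is a minimal sketch, not part of ait: `build_dump_command` is a hypothetical helper, the output directory `./dump_out` is an illustrative value, and treating only `<` and `>` as redirection characters is an assumption. It uses only options listed in the parameter table.

```python
import shlex


def build_dump_command(exec_cmd, dump_types=None, execute_range=None, output_dir=None):
    """Assemble an `ait llm dump` argument list (hypothetical helper, not part of ait)."""
    # --exec does not support redirection; assumption: "<" and ">" are the
    # characters to reject. Wrap such commands in a shell script instead.
    if any(ch in exec_cmd for ch in "<>"):
        raise ValueError("--exec does not support redirection; use a wrapper shell script")
    argv = ["ait", "llm", "dump", "--exec", exec_cmd]
    if dump_types:
        # Several dump types are passed space-separated, e.g. --type layer tensor
        argv += ["--type", *dump_types]
    if execute_range:
        argv += ["-er", execute_range]
    if output_dir:
        argv += ["-o", output_dir]
    return argv


argv = build_dump_command(
    "bash run.sh patched/models/modeling_xxx.py",
    dump_types=["layer", "tensor"],
    execute_range="3,5,7,7",
    output_dir="./dump_out",
)
# Print a copy-pasteable shell command; pass argv to subprocess.run to execute it.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would run the dump without going through a shell, so the --exec string reaches ait as a single argument.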

Parameters

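Parameter	Description	Required
--exec	The command that launches the large-model inference script. Redirection characters are not supported in the command; to redirect output, write the command into a shell script and launch that script instead. Example: --exec "bash run.sh patches/models/modeling_xxx.py".	Yes
--type	Dump type. Allowed values: ['model', 'layer', 'op', 'kernel', 'tensor', 'cpu_profiling', 'onnx'], which save Model topology information, Layer topology information, operator information, kernel operator information, Tensor data, cpu_profiling data, and an ONNX model, respectively. 'onnx' must be combined with 'model' or 'layer'; it converts the Model and Layer topology information to ONNX so that the model structure can be visualized. Default: ['tensor']. Usage: --type layer tensor	No
-sd, --only-save-desc	Switch for saving only Tensor description information. When enabled, only the description of each Tensor is dumped. Off by default. Usage: -sd	No
-ids, --save-operation-ids	Dump only the Tensors of the operators with the given ids. Empty by default, which dumps everything. Usage: -ids 2, 3_1 dumps only the data of the 2nd Operation and of the 1st operator of the 3rd Operation; ids start at 0. If you do not know the operator ids, first run ait llm dump --exec xx --type model to dump the Model information, which lists the ids of all operators in the model.	No
-er, --execute-range	Range of token rounds to dump. Intervals are closed on both ends, and several intervals may be given in sequence. Default: round 0 only. Usage: -er 1,3 or -er 3,5,7,7 (the intervals [3, 5] and [7, 7], that is, token rounds 3, 4, 5, and 7).	No
-child, --save-operation-child	Whether to dump the Tensor data of all child operations. Only takes effect when -ids is used. Default: "True". Usage: -child True	No
-time, --save-time	When to save, one of [0, 1, 2]: "0" saves before execution (before), "1" saves after execution (after), and "2" saves both. Default: after. Usage: -time 0	No
-opname, --operation-name	Operator type to dump. Only the beginning of the operator name is needed, because matching is by prefix; for example, self matches selfattention. Usage: -opname self	No
-tiling, --save-tiling	Whether to save tiling data. Default: false. Usage: -tiling	No
-stp, --save-tensor-part	Which part of the Tensors to save: "0" saves intensor only, "1" saves outtensor only, and "2" saves both. Default: 2. Example: -stp 1	No
-o, --output	Output directory for the dumped data. Default: './'. Example: -o aasx/sss	No
-device, --device-id	Device id whose data is dumped. Default: "None", meaning no restriction. For example, --device-id 1 dumps only the data of the device whose id is "1".	No
-l, --log-level	Log level. Default: "info". Allowed values: debug, info, warning, error, fatal, critical	No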
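
The closed-interval form of -er is easy to misread, so here is a minimal sketch of how a value such as 3,5,7,7 maps to token rounds. `expand_execute_range` is a hypothetical helper written for illustration; it mirrors the semantics described in the table and is not the parser ait itself uses.

```python
def expand_execute_range(value):
    """Expand an -er value such as "3,5,7,7" into the token rounds it covers."""
    bounds = [int(part) for part in value.split(",")]
    if len(bounds) % 2 != 0:
        raise ValueError("expected an even number of bounds (start,end pairs)")
    rounds = []
    # Consecutive pairs are (start, end) intervals, closed on both ends.
    for start, end in zip(bounds[0::2], bounds[1::2]):
        if start > end:
            raise ValueError(f"invalid interval [{start}, {end}]")
        rounds.extend(range(start, end + 1))
    return rounds


print(expand_execute_range("1,3"))      # [1, 2, 3]
print(expand_execute_range("3,5,7,7"))  # [3, 4, 5, 7]
```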

Dump output locations

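By default, dumps are written to "{DUMP_DIR}", which is the current directory. If an output directory is specified, they are written to that directory, "{OUTPUT_DIR}", instead.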

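Tensor data is written under the "ait_dump" directory of the dump path, at {DUMP_DIR}/ait_dump/tensors/{device_id}_{PID}/{TID} (with an older CANN package, the Tensor path may differ).
Layer information is written under the "ait_dump" directory of the dump path, at {DUMP_DIR}/ait_dump/layer/{PID}.
Model information is written under the "ait_dump" directory of the dump path, at {DUMP_DIR}/ait_dump/model/{PID}.
Because a Model is composed of Layers, dumping Model information also writes Layer information by default.
ONNX must be used together with Layer or Model; its output is written to the same directories as the Model and Layer information.
cpu_profiling data is written under the "ait_dump" directory of the dump path, at {DUMP_DIR}/ait_dump/cpu_profiling/{TIMESTAMP}/operation_statistic_{executeCount}.txt.
Operator information is written under the "ait_dump" directory of the dump path, at {DUMP_DIR}/ait_dump/operation_io_tensors/{PID}/operation_tensors_{executeCount}.csv.
Kernel operator information is written under the "ait_dump" directory of the dump path, at {DUMP_DIR}/ait_dump/kernel_io_tensors/{PID}/kernel_tensors_{executeCount}.csv.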

{device_id} is the device id, {PID} is the process id, {TID} is the token_id, {TIMESTAMP} is a timestamp, and {executeCount} is the number of times the Operation has run.
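
To locate results after a run, the path templates above can be filled in programmatically. The sketch below is illustrative only: `dump_paths` is a hypothetical helper, the sample values (directory `out`, device 0, PID 12345) are made up, and the cpu_profiling path is left out because its timestamp is not known in advance.

```python
from pathlib import PurePosixPath


def dump_paths(dump_dir, device_id, pid, tid, execute_count):
    """Fill in the documented path templates (hypothetical helper, not part of ait)."""
    root = PurePosixPath(dump_dir) / "ait_dump"
    return {
        # Tensor data: tensors/{device_id}_{PID}/{TID}
        "tensor": root / "tensors" / f"{device_id}_{pid}" / str(tid),
        # Layer and Model topology: layer/{PID} and model/{PID}
        "layer": root / "layer" / str(pid),
        "model": root / "model" / str(pid),
        # Operator and kernel operator CSV files, one per execution count
        "op": root / "operation_io_tensors" / str(pid) / f"operation_tensors_{execute_count}.csv",
        "kernel": root / "kernel_io_tensors" / str(pid) / f"kernel_tensors_{execute_count}.csv",
    }


paths = dump_paths("out", device_id=0, pid=12345, tid=0, execute_count=0)
for name, path in paths.items():
    print(f"{name}: {path}")
```

Because the Tensor path may differ with older CANN packages, treat the "tensor" entry as the expected location rather than a guarantee.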
