msobjdump

msobjdump parses and decompresses the Executable and Linkable Format (ELF) files generated by operator compilation in any scenario and displays the result in a readable format, so that you can inspect the kernel file information.

ELF is a file format used for binary files, executable files, object code, shared libraries, and core dumps, such as *.a and *.so files. An ELF file has the following structure:

  • ELF header: describes the structure of the entire file, including the file type, machine type, and version number.
  • Program header table: describes the segments in the file, including how the program is loaded into memory for execution.
  • Section header table: describes each section in the file, including the code, data, and symbol table of the program.

ELF is mainly used on Linux and other Unix-like operating systems for program execution and linking.
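To make the header layout concrete, the following minimal Python sketch decodes the fixed 64-byte ELF64 header with the standard struct module. It illustrates the generic ELF layout only and is not part of msobjdump. The function name parse_elf64_header is hypothetical, and the sample bytes are assembled from the header values shown in the example output in this document.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}
ELF_TYPE = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def parse_elf64_header(data):
    """Decode the 64-byte ELF64 header at the start of `data`."""
    if len(data) < 64 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_class != 2:
        raise ValueError("this sketch only handles ELF64")
    if ei_data not in ELF_DATA:
        raise ValueError("unknown data encoding")
    byte_order = "<" if ei_data == 1 else ">"
    # Fields that follow the 16-byte e_ident array, in ELF64 layout order.
    fields = struct.unpack_from(byte_order + "HHIQQQIHHHHHH", data, 16)
    (e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum, e_shstrndx) = fields
    return {
        "class": ELF_CLASS[ei_class],
        "data": ELF_DATA[ei_data],
        "type": ELF_TYPE.get(e_type, hex(e_type)),
        "machine": e_machine,
        "entry": e_entry,
        "phoff": e_phoff,
        "shoff": e_shoff,
        "phnum": e_phnum,
        "shnum": e_shnum,
        "shstrndx": e_shstrndx,
    }


# Sample header assembled from the values in the example output (assumption:
# used here only to exercise the parser without needing a real file).
sample = bytes.fromhex("7f454c46020101000000000000000000") + struct.pack(
    "<HHIQQQIHHHHHH",
    2, 0x1029, 1, 0, 64, 510280, 0x940000, 64, 56, 2, 64, 20, 18,
)
info = parse_elf64_header(sample)
print(info)
```

To try it on a real file, read the first 64 bytes in binary mode and pass them to parse_elf64_header.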

Tool Installation

  1. Install the msobjdump tool.

    The tool is released with the CANN package. (For details about how to install the CANN package, see Environment Setup.) The default path is ${INSTALL_DIR}/tools/msobjdump. Replace ${INSTALL_DIR} with the actual CANN component directory. If the Ascend-CANN-Toolkit package is installed as the root user, the CANN component directory is /usr/local/Ascend/ascend-toolkit/latest.

  2. Set environment variables.
    • If the Ascend-CANN-Toolkit package is installed as the root user:
      source /usr/local/Ascend/ascend-toolkit/set_env.sh
      source /usr/local/Ascend/ascend-toolkit/latest/toolkit/bin/setenv.bash
      
    • If the Ascend-CANN-Toolkit package is installed as a non-root user:
      source ${HOME}/Ascend/ascend-toolkit/set_env.sh
      source ${HOME}/Ascend/ascend-toolkit/latest/toolkit/bin/setenv.bash
      
  3. Check whether the tool is successfully installed.
    Run the following command. If the help information (the output of --help or -h) is displayed, the environment is set up correctly and the tool works as expected.
    msobjdump -h
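If the command is not found, the environment variables have most likely not been set in the current shell. As a small, hedged sketch, the check can also be done from Python with shutil.which, which searches PATH for an executable. The helper name find_tool is hypothetical.

```python
import shutil


def find_tool(name="msobjdump"):
    """Return the absolute path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


path = find_tool()
if path is None:
    print("msobjdump is not on PATH; source the set_env.sh script first")
else:
    print(f"msobjdump found at {path}")
```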
    

Functions

  • Parsing the ELF file

    To parse the internal information of the ELF file, such as the file length, file type, segment information of each file, and symbol table information, run the following command:

    msobjdump --dump-elf ${elf_file}
    

    ${elf_file} indicates the path of the ELF file to be parsed, for example, /home/op_api/lib_api.so.

  • Decompressing the ELF file

    To decompress the ELF file and save the result to a specified directory, run the following command:

    msobjdump --extract-elf ${elf_file} --out-dir ${out_path}
    

    ${elf_file} indicates the path of the ELF file to be decompressed, for example, /home/op_api/lib_api.so. ${out_path} indicates the output directory, for example, /home/extract/. If --out-dir is not set, the tool saves the decompressed files to the current execution path by default.

  • Obtaining the ELF file list

    To print the list of all device.o files contained in the ELF file, run the following command:

    msobjdump --list-elf ${elf_file}
    

    ${elf_file} indicates the path of the ELF file whose device.o files are to be listed, for example, /home/op_api/lib_api.so.

For details about all command-line options of the tool, see Table 1.

Table 1 msobjdump options

| Option (Case-Sensitive) | Description | Required (Yes/No) |
|---|---|---|
| -d, --dump-elf | Parses information such as the file length, file type, and symbol table of each device.o file contained in the ELF file, and displays the information on the screen. For details about key fields, see Table 2. | Yes. Specify one of these three options. |
| -e, --extract-elf | Decompresses each device.o or device.json file contained in the ELF file and saves it to the output path, preserving the original folder structure. If --out-dir is not specified, the files are saved to the current execution path by default. | Yes. Specify one of these three options. |
| -l, --list-elf | Displays the list of device.o files contained in the ELF file on the screen. | Yes. Specify one of these three options. |
| -o, --out-dir | Specifies the output path of the decompressed files. This option must be used together with --extract-elf. NOTE: msobjdump can be called by multiple users concurrently. However, each user must specify a different --out-dir. Otherwise, the output files may be overwritten. | No |
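The option rules in Table 1 can be sketched as a small Python wrapper. This is an illustration under stated assumptions, not an official interface: the helper names build_command and extract_to_private_dir are hypothetical, and only the options listed in Table 1 are used. A fresh temporary directory is created for each extraction so that concurrent callers do not overwrite each other's output.

```python
import subprocess
import tempfile

# Mode names are local to this sketch; the option strings come from Table 1.
MODES = {"dump": "--dump-elf", "extract": "--extract-elf", "list": "--list-elf"}


def build_command(mode, elf_file, out_dir=None):
    """Build the msobjdump argument list for one of the three modes."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if out_dir is not None and mode != "extract":
        raise ValueError("--out-dir must be used together with --extract-elf")
    command = ["msobjdump", MODES[mode], elf_file]
    if out_dir is not None:
        command += ["--out-dir", out_dir]
    return command


def extract_to_private_dir(elf_file):
    """Extract into a unique directory and return its path."""
    out_dir = tempfile.mkdtemp(prefix="msobjdump_")
    subprocess.run(build_command("extract", elf_file, out_dir), check=True)
    return out_dir


print(build_command("extract", "/home/op_api/lib_api.so", "/home/extract/"))
```

Calling extract_to_private_dir requires msobjdump to be installed and on PATH; build_command alone has no such requirement.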

Table 2 Parsing the ELF fields

| Key Field | Description | Required (Yes/No) |
|---|---|---|
| VERSION | Version number. | No |
| TYPE COUNT | Number of kernel files contained in the current section. | No |
| ELF FILE ${id} | Listed kernel files. The file names are combined in the sequence of ${sec_prefix}_${file_index}_${kernel_type}.o, where ${sec_prefix} indicates the section name (obtained by the tool based on the keyword .ascend.kernel), ${file_index} indicates the file index, and ${kernel_type} indicates the kernel type. | No |
| KERNEL TYPE | Type of the current kernel file. The mapping is {0: 'mix', 1: 'aiv', 2: 'aic'}. | No |
| KERNEL LEN | Length of the current kernel file. | No |
| elf header infos | Information such as the ELF header, section headers, key to flags, program headers, and symbol tables. | Yes |
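As a hedged illustration of how the key fields in Table 2 might be consumed by a script, the following Python sketch collects the bracketed fields from --dump-elf output and splits a kernel file name into its three parts. The function names are hypothetical, the sample text is copied from the example output in this document, and the name split assumes that ${kernel_type} contains no underscore (mix, aiv, or aic).

```python
import re

# Matches lines such as "[VERSION]: 1" or "[ELF FILE 0]: name.o".
FIELD_RE = re.compile(r"^\s*\[([A-Z ]+?)(?: (\d+))?\]:\s*(.*\S)\s*$")


def parse_dump_summary(text):
    """Collect VERSION, TYPE COUNT, and per-file fields from dump output."""
    result = {"version": None, "type_count": None, "files": []}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue
        key, _index, value = match.groups()
        if key == "VERSION":
            result["version"] = int(value)
        elif key == "TYPE COUNT":
            result["type_count"] = int(value)
        elif key == "ELF FILE":
            result["files"].append({"name": value})
        elif key == "KERNEL TYPE" and result["files"]:
            result["files"][-1]["kernel_type"] = value
        elif key == "KERNEL LEN" and result["files"]:
            result["files"][-1]["kernel_len"] = int(value)
    return result


def split_kernel_file_name(name):
    """Split ${sec_prefix}_${file_index}_${kernel_type}.o into its parts."""
    stem = name[:-2] if name.endswith(".o") else name
    sec_prefix, file_index, kernel_type = stem.rsplit("_", 2)
    return sec_prefix, int(file_index), kernel_type


SAMPLE = """\
===========================
[VERSION]: 1
[TYPE COUNT]: 1
===========================
[ELF FILE 0]: ascendxxxb1_ascendc_kernels_npu_0_mix.o
[KERNEL TYPE]: mix
[KERNEL LEN]: 511560
====== [elf heard infos] ======
"""

summary = parse_dump_summary(SAMPLE)
print(summary)
print(split_kernel_file_name(summary["files"][0]["name"]))
```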

Example (Kernel Launch Operator Project)

Take the MatMulInvocationNeo operator as an example. For details about the complete operator project, see the sample of matmul multi-core kernel launch. Assume that ${cmake_install_dir} is the root directory of the operator CMake compilation result. The result directory structure is as follows and is similar to the one described in Compiling the CMake Build Configuration File.

out
├── lib
│   └── libascendc_kernels_npu.so
└── include
    └── ascendc_kernels_npu
        ├── aclrtlaunch_matmul_custom.h
        └── aclrtlaunch_triple_chevrons_func.h

To parse and decompress the library files (such as *.so and *.a files) generated after compilation, run the following commands:

  • Parsing each device.o file:
    msobjdump --dump-elf ${cmake_install_dir}/out/lib/libascendc_kernels_npu.so
    

    After this command is executed, the information about each device.o file is displayed on the screen. The following is an example. For details about the fields, see Table 2.

    ===========================
    [VERSION]: 1
    [TYPE COUNT]: 1
    ===========================
    [ELF FILE 0]: ascendxxxb1_ascendc_kernels_npu_0_mix.o
    [KERNEL TYPE]: mix
    [KERNEL LEN]: 511560
    ====== [elf heard infos] ======
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           <unknown>: 0x1029
      Version:                           0x1
      Entry point address:               0x0
      Start of program headers:          64 (bytes into file)
      Start of section headers:          510280 (bytes into file)
      Flags:                             0x940000
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         2
      Size of section headers:           64 (bytes)
      Number of section headers:         20
      Section header string table index: 18
    
    Section Headers:
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
      [ 1] .text             PROGBITS        0000000000000000 0000b0 010a08 00  AX  0   0  4
       .....................................................................................
      [19] .strtab           STRTAB          0000000000000000 071278 00b6cb 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
      L (link order), O (extra OS processing required), G (group), T (TLS),
      C (compressed), x (unknown), o (OS specific), E (exclude),
      D (mbind), p (processor specific)
    
    There are no section groups in this file.
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x0000b0 0x0000000000000000 0x0000000000000000 0x010aa8 0x010aa8 R E 0x1000
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
    ......
    
  • Decompressing each device.o file and saving it to disk:
    msobjdump --extract-elf ${cmake_install_dir}/out/lib/libascendc_kernels_npu.so
    

    After this command is executed, the ascendxxxb1_ascendc_kernels_npu_0_mix.o file is saved to the current execution path by default.

  • Obtaining the list of device.o files:
    msobjdump --list-elf ${cmake_install_dir}/out/lib/libascendc_kernels_npu.so
    

    After this command is executed, the list of files is displayed on the screen, as shown in the following example:

    ELF file    0: ascendxxxb1_ascendc_kernels_npu_0_mix.o
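When the file list is needed in a script, the --list-elf output shown above can be parsed line by line. The following Python sketch assumes the "ELF file <index>: <name>" line format seen in the example output; the function name parse_list_output is hypothetical.

```python
import re

# Matches lines such as "ELF file    0: some_name.o".
LIST_RE = re.compile(r"^\s*ELF file\s+(\d+):\s*(\S+)\s*$")


def parse_list_output(text):
    """Map each file index to the file name printed by --list-elf."""
    entries = {}
    for line in text.splitlines():
        match = LIST_RE.match(line)
        if match:
            entries[int(match.group(1))] = match.group(2)
    return entries


SAMPLE = "ELF file    0: ascendxxxb1_ascendc_kernels_npu_0_mix.o\n"
listed = parse_list_output(SAMPLE)
print(listed)
```

Lines that do not match the pattern, such as elision markers or blank lines, are ignored.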
    

Example (Simple Custom Operator Project)

Take the following operator project as an example. Assume that ${cmake_install_dir} is the root directory of the operator CMake compilation result. The result directory structure is as follows and is similar to the one described in Operator Project Build.

├── op_api
│   ├── include
│   │   ├── aclnn_acos_custom.h
│   │   ├── aclnn_matmul_leakyrelu_custom.h
│   │   └── .........
│   └── lib
│       └── libcust_opapi.so

To parse and decompress the library files (such as *.so and *.a files) generated after compilation, run the following commands:

  • Parsing each device.o file:
    msobjdump --dump-elf ${cmake_install_dir}/op_api/lib/libcust_opapi.so 
    

    After this command is executed, the section and symbol table information of each operator device.o file is printed on the screen. The following is an example. For details about the fields, see Table 2.

    ===== [elf heard infos] in ascendxxx_acos_custom_AcosCustom_da824ede53d7e754f85c14b9446ec2fc.o =====:
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ............................................
      Size of program headers:           56 (bytes)
      Number of program headers:         3
      Size of section headers:           64 (bytes)
      Number of section headers:         9
      Section header string table index: 7
    Section Headers:
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0 
       .....................................................................................
      [ 8] .strtab           STRTAB          0000000000000000 00529b 000119 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
      L (link order), O (extra OS processing required), G (group), T (TLS),
      C (compressed), x (unknown), o (OS specific), E (exclude),
      D (mbind), p (processor specific)
    ......................
    
    ===== [elf heard infos] in ascendxxx_matmul_leakyrelu_custom_MatmulLeakyreluCustom_e052bee3255764ac919095f3bdf83389.o =====:
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      .............................................
      Section header string table index: 6
    Section Headers:
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
      [ 1] .text             PROGBITS        0000000000000000 0000e8 007ed8 00  AX  0   0  4
      [ 2] .data             PROGBITS        0000000000008000 0080e8 000008 00  WA  0   0 256
      [ 3] .comment          PROGBITS        0000000000000000 0080f0 000043 01  MS  0   0  1
      [ 4] .bl_uninit        NOBITS          0000000000000000 008133 000020 00      0   0  1
      [ 5] .symtab           SYMTAB          0000000000000000 008138 0000c0 18      7   1  8
      [ 6] .shstrtab         STRTAB          0000000000000000 0081f8 00003b 00      0   0  1
      [ 7] .strtab           STRTAB          0000000000000000 008233 0000ec 00      0   0  1
    ..................................
    
  • Decompressing each device.o file and saving it to disk:
    msobjdump --extract-elf ${cmake_install_dir}/op_api/lib/libcust_opapi.so 
    

    After this command is executed, the decompressed file is saved in the current execution path by default. The result directory is as follows:

    |-- config                                                               // Directory of the operator prototype configuration file
    |    ├── ${soc_version}   
    |        ├── acos_custom.json                                  
    |        ├── matmul_leakyrelu_custom.json                               
    |        ├── .......                                             
    |-- ${soc_version}                                                     // Ascend AI Processor name
    |     ├── acos_custom                                               // Basic single-operator compilation file (*.o) and the corresponding *.json file
    |         ├── AcosCustom_da824ede53d7e754f85c14b9446ec2fc.json      // Naming rules: ${op_type}_${parm_info}.json or ${op_type}_${parm_info}.o, where ${parm_info} is the identifier generated based on the operator input/output information such as the data type and shape.
    |         ├── AcosCustom_da824ede53d7e754f85c14b9446ec2fc.o
    |         ├── AcosCustom_dad9c8ca8fcbfd789010c8b1c0da8e26.json
    |         ├── AcosCustom_dad9c8ca8fcbfd789010c8b1c0da8e26.o
    |     ├── matmul_leakyrelu_custom  
    |         ├── MatmulLeakyreluCustom_e052bee3255764ac919095f3bdf83389.json
    |         ├── MatmulLeakyreluCustom_e052bee3255764ac919095f3bdf83389.o
    |     ├── axpy_custom    
    |         ├── .....
    

    The following uses the files decompressed from the acos_custom operator compilation result as an example:

    • View the operator prototype (acos_custom.json).
      {
          "binList": [
              {
                  "implMode": "high_performance",
                  "int64Mode": false,
                  "simplifiedKeyMode": 0,
                  "simplifiedKey": [......],
                  "staticKey": "96b2b4bb2e3xxx,ee37ce8796ef139dexxxx",
                  "inputs": [
                      {
                          "name": "x",
                          "index": 0,
                          "dtype": "float32",
                          "format": "ND",
                          "paramType": "required",
                          "shape": [
                              -2
                          ],
                          "format_match_mode": "FormatAgnostic"
                      }
                  ],
                  "outputs": [
                      {
                          "name": "y",
                          "index": 0,
                          "dtype": "float32",
                          "format": "ND",
                          "paramType": "required",
                          "shape": [
                              -2
                          ],
                          "format_match_mode": "FormatAgnostic"
                      }
                  ],
                  "attrs": [
                      {
                          "name": "tmp",
                          "dtype": "int",
                          "value": 0
                      },
                      .........
                  ],
                  "opMode": "dynamic",
                  "optionalInputMode": "gen_placeholder",
                  "deterministic": "ignore",
                  "binInfo": {
                      "jsonFilePath": "ascendxxx/acos_custom/AcosCustom_da824ede53d7e754f85c14b9446ec2fc.json"
                  }
              },
              {
                  "implMode": "high_performance",
                  "int64Mode": false,
                  "simplifiedKeyMode": 0,
                  "simplifiedKey": [
        
                  ],
                  "staticKey": "27d6f997f2f3551axxxx,1385590c47affa578eb429xxx",
                  "inputs": [
                      {
                          "name": "x",
                          "index": 0,
                          "dtype": "float16",
                          "format": "ND",
                          "paramType": "required",
                          "shape": [
                              -2
                          ],
                          "format_match_mode": "FormatAgnostic"
                      }
                  ],
                  "outputs": [
                      {
                          "name": "y",
                          "index": 0,
                          "dtype": "float16",
                          "format": "ND",
                          "paramType": "required",
                          "shape": [
                              -2
                          ],
                          "format_match_mode": "FormatAgnostic"
                      }
                  ],
                  "attrs": [
                      {
                          "name": "tmp",
                          "dtype": "int",
                          "value": 0
                      },
                      .........
                  ],
                  "opMode": "dynamic",
                  "optionalInputMode": "gen_placeholder",
                  "deterministic": "ignore",
                  "binInfo": {
                      "jsonFilePath": "ascendxxx/acos_custom/AcosCustom_dad9c8ca8fcbfd789010c8b1c0da8e26.json"
                  }
              }
          ]
      }
    • To parse ${op_type}_${parm_info}.o to obtain the .ascend.meta section information, run the following command:
      msobjdump --dump-elf ./AcosCustom_da824ede53d7e754f85c14b9446ec2fc.o
      

      After this command is executed, the following information is displayed on the screen. For details about the parameters, see Table 3, Table 4, and Table 5.

      .ascend.meta. [0]: AcosCustom_da824ede53d7e754f85c14b9446ec2fc_1
      T: 1  L: 4  V: 3
      F_TYPE_KTYPE: K_TYPE_AIV
      .ascend.meta. [0]: AcosCustom_da824ede53d7e754f85c14b9446ec2fc_2_mix_aiv
      T: 1  L: 4  V: 5
      F_TYPE_KTYPE: K_TYPE_MIX_AIV_MAIN
      T: 3  L: 4  V: [0:1]
      F_TYPE_MIX_TASK_RATION: [0:1]
      .ascend.meta. [0]: AcosCustom_da824ede53d7e754f85c14b9446ec2fc_3_mix_aiv
      T: 1  L: 4  V: 5
      F_TYPE_KTYPE: K_TYPE_MIX_AIV_MAIN
      T: 3  L: 4  V: [0:1]
      F_TYPE_MIX_TASK_RATION: [0:1]
      
      Table 3 Type mapping

      | Type | Parsing Type Mapping | Description |
      |---|---|---|
      | 1 | F_TYPE_KTYPE | Kernel type. |
      | 2 | F_TYPE_CROSS_CORE_SYNC | Hardware synchronization (syncall) type. |
      | 3 | F_TYPE_MIX_TASK_RATION | Core allocation type when the kernel function is running. |

      Table 4 Kernel type mapping

      | Value of the Kernel Type | Kernel Type | Description |
      |---|---|---|
      | 1 | K_TYPE_AICORE | This parameter is reserved and is not supported in the current version. Only the AI Cores are started during operator execution. For example, if blockdim is set to 5 on the host, 5 AI Cores are started. |
      | 2 | K_TYPE_AIC | Only the Cube Cores on the AI Cores are started during operator execution. For example, if blockdim is set to 10 on the host, 10 Cube Cores are started. |
      | 3 | K_TYPE_AIV | Only the Vector Cores on the AI Cores are started during operator execution. For example, if blockdim is set to 10 on the host, 10 Vector Cores are started. |
      | 4 | K_TYPE_MIX_AIC_MAIN | In the scenario of mixing AIC and AIV, if the kernel function type is set to MIX, the Cube and Vector Cores on the AI Cores are started at the same time during operator execution. For example, you can set blockdim to 10 and task_ration to 1:2 on the host. In this case, 10 Cube Cores and 20 Vector Cores are started. |
      | 5 | K_TYPE_MIX_AIV_MAIN | In the scenario of mixing AIC and AIV, when multi-core control instructions are used and the kernel function type is set to MIX, the Cube and Vector Cores on the AI Cores are started at the same time during operator execution. For example, you can set blockdim to 10 and task_ration to 1:2 on the host. In this case, 10 Vector Cores and 20 Cube Cores are started. |
      | 6 | K_TYPE_AIC_ROLLBACK | When the operator is executed, the AI Cores and Vector Cores are started at the same time. In this case, the AI Cores are used as the Cube Cores. |
      | 7 | K_TYPE_AIV_ROLLBACK | When the operator is executed, the AI Cores and Vector Cores are started at the same time. In this case, the AI Cores are used as the Vector Cores. |

      Table 5 Hardware synchronization mapping

      | Value | Type | Description |
      |---|---|---|
      | 1 | C_TYPE_USE_SYNC | Hardware synchronization is used. |
      | 0 | C_TYPE_NO_USE_SYNC | Hardware synchronization is not used. |
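The mappings in Table 3, Table 4, and Table 5 can be applied mechanically to the "T: ... L: ... V: ..." lines of the .ascend.meta output. The following Python sketch is an illustration only: the function name decode_tlv_line is hypothetical, and the line format is assumed from the sample output in this document.

```python
import re

# Table 3: type value to parsed type name.
TYPE_NAMES = {
    1: "F_TYPE_KTYPE",
    2: "F_TYPE_CROSS_CORE_SYNC",
    3: "F_TYPE_MIX_TASK_RATION",
}
# Table 4: kernel type value to kernel type name.
KERNEL_TYPES = {
    1: "K_TYPE_AICORE",
    2: "K_TYPE_AIC",
    3: "K_TYPE_AIV",
    4: "K_TYPE_MIX_AIC_MAIN",
    5: "K_TYPE_MIX_AIV_MAIN",
    6: "K_TYPE_AIC_ROLLBACK",
    7: "K_TYPE_AIV_ROLLBACK",
}
# Table 5: hardware synchronization value to name.
SYNC_TYPES = {0: "C_TYPE_NO_USE_SYNC", 1: "C_TYPE_USE_SYNC"}

TLV_RE = re.compile(r"^\s*T:\s*(\d+)\s+L:\s*(\d+)\s+V:\s*(.+?)\s*$")


def decode_tlv_line(line):
    """Return (type name, decoded value) for a T/L/V line, else None."""
    match = TLV_RE.match(line)
    if match is None:
        return None
    type_id = int(match.group(1))
    raw = match.group(3)
    name = TYPE_NAMES.get(type_id, f"UNKNOWN_TYPE_{type_id}")
    if type_id == 1 and raw.isdigit():
        return name, KERNEL_TYPES.get(int(raw), raw)
    if type_id == 2 and raw.isdigit():
        return name, SYNC_TYPES.get(int(raw), raw)
    # Other values, such as the task ratio "[0:1]", are kept as printed.
    return name, raw


for sample_line in ("T: 1  L: 4  V: 3", "T: 1  L: 4  V: 5", "T: 3  L: 4  V: [0:1]"):
    print(decode_tlv_line(sample_line))
```

The decoded names agree with the F_TYPE_KTYPE and F_TYPE_MIX_TASK_RATION lines that the tool prints after each T/L/V line in the sample output.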

    • View ${op_type}_${parm_info}.json to obtain the device.o operator information.
      {
          "binFileName": "AcosCustom_da824ede53d7e754f85c14b9446ec2fc",
          "binFileSuffix": ".o",
          "blockDim": -1,
          "coreType": "MIX",
          "intercoreSync": 1,
          "kernelName": "AcosCustom_da824ede53d7e754f85c14b9446ec2fc",
          "magic": "RT_DEV_BINARY_MAGIC_ELF",
          "memoryStamping": [],
          "opParaSize": 24,
          "parameters": [],
          "sha256": "94e32d04fcaf435411xxxxxxxx",
          "workspace": {
              "num": 1,
              "size": [
                  -1
              ],
              "type": [
                  0
              ]
          },
          "kernelList": [
              {
                  "tilingKey": 1,
                  "kernelType": "MIX_AIC",
                  "taskRation": "0:1",
                  "crossCoreSync": 0,
                  "kernelName": "AcosCustom_da824ede53d7e754f85c14b9446ec2fc_1"
              },
              .........
          ],
          "taskRation": "tilingKey",
          "optionalInputMode": "gen_placeholder",
          "debugOptions": "printf",
          "debugBufSize": 78643200,
          "compileInfo": {},
          "supportInfo": {                                                        // Operator prototype information
              "implMode": "high_performance",
              "int64Mode": false,
              "simplifiedKeyMode": 0,
              "simplifiedKey": [......],
              "staticKey": "96b2b4bb2e35fa3dxxx,ee37ce8796ef139dedxxxxxxxx",
              "inputs": [
                  {
                      "name": "x",
                      "index": 0,
                      "dtype": "float32",
                      "format": "ND",
                      "paramType": "required",
                      "shape": [
                          -2
                      ],
                      "format_match_mode": "FormatAgnostic"
                  }
              ],
              "outputs": [
                  {
                      "name": "y",
                      "index": 0,
                      "dtype": "float32",
                      "format": "ND",
                      "paramType": "required",
                      "shape": [
                          -2
                      ],
                      "format_match_mode": "FormatAgnostic"
                  }
              ],
              "attrs": [
                  {
                      "name": "tmp",
                      "dtype": "int",
                      "value": 0
                  },
                  .........
              ],
              "opMode": "dynamic",
              "optionalInputMode": "gen_placeholder",
              "deterministic": "ignore"
          },
          "filePath": "ascendxxx/acos_custom/AcosCustom_da824ede53d7e754f85c14b9446ec2fc.json"
      }
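To inspect the per-kernel entries of such a file programmatically, the kernelList array can be read with the standard json module. The sketch below uses a trimmed copy of the sample above, because the listing as printed contains elisions and a comment and is therefore not strict JSON. The function name summarize_kernels is hypothetical.

```python
import json

# Trimmed, strictly valid subset of the sample JSON (assumption: real files
# contain more fields, which this sketch ignores).
SAMPLE = """
{
    "binFileName": "AcosCustom_da824ede53d7e754f85c14b9446ec2fc",
    "binFileSuffix": ".o",
    "coreType": "MIX",
    "kernelList": [
        {
            "tilingKey": 1,
            "kernelType": "MIX_AIC",
            "taskRation": "0:1",
            "crossCoreSync": 0,
            "kernelName": "AcosCustom_da824ede53d7e754f85c14b9446ec2fc_1"
        }
    ]
}
"""


def summarize_kernels(info):
    """Return the binary file name and one tuple per kernelList entry."""
    binary = info["binFileName"] + info["binFileSuffix"]
    rows = [
        (k["tilingKey"], k["kernelType"], k["taskRation"], k["kernelName"])
        for k in info.get("kernelList", [])
    ]
    return binary, rows


binary, rows = summarize_kernels(json.loads(SAMPLE))
print(binary)
for row in rows:
    print(row)
```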
  • Obtaining the list of device.o files:
    msobjdump --list-elf ${cmake_install_dir}/op_api/lib/libcust_opapi.so 
    

    After this command is executed, the list of files is displayed on the screen, as shown in the following example:

    ELF file    0: ascendxxx_acos_custom_AcosCustom_dad9c8ca8fcbfd789010c8b1c0da8e26.json
    ELF file    1: ascendxxx_acos_custom_AcosCustom_dad9c8ca8fcbfd789010c8b1c0da8e26.o
    ....................
    ELF file    2: ascendxxx_acos_custom_AcosCustom_da824ede53d7e754f85c14b9446ec2fc.json
    ELF file    3: ascendxxx_acos_custom_AcosCustom_da824ede53d7e754f85c14b9446ec2fc.o
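In this project type, each kernel binary is listed next to a .json file with the same base name. As a hedged sketch, the listed names can be grouped by base name to check that every .o file has its .json counterpart. The function name pair_by_stem is hypothetical, and the input names are taken from the sample list above.

```python
import os
from collections import defaultdict


def pair_by_stem(names):
    """Group file names by base name and return (.o name, .json name) pairs."""
    groups = defaultdict(dict)
    for name in names:
        stem, extension = os.path.splitext(name)
        groups[stem][extension] = name
    return {
        stem: (found.get(".o"), found.get(".json"))
        for stem, found in groups.items()
    }


names = [
    "ascendxxx_acos_custom_AcosCustom_dad9c8ca8fcbfd789010c8b1c0da8e26.json",
    "ascendxxx_acos_custom_AcosCustom_dad9c8ca8fcbfd789010c8b1c0da8e26.o",
    "ascendxxx_acos_custom_AcosCustom_da824ede53d7e754f85c14b9446ec2fc.json",
    "ascendxxx_acos_custom_AcosCustom_da824ede53d7e754f85c14b9446ec2fc.o",
]
pairs = pair_by_stem(names)
for stem, (object_file, json_file) in pairs.items():
    print(stem, object_file is not None, json_file is not None)
```

A pair with None in either position indicates a binary without a description file, or the reverse.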