Viewing the Operator Simulation Pipeline

The msOpGen tool parses user-generated dump files and generates operator simulation pipeline files (trace.json).

  1. Run the install.sh file in the ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch directory to generate the CustomOp folder. For details, see Link.

    This sample project does not support Atlas A3 Training Series Product or Atlas Training Series Product.

    ./install.sh -v Ascendxxxyy    # xxxyy indicates the processor type.
  2. Build the operator project.
    1. Complete build configurations by referring to Preparations.
    2. To generate an operator simulation pipeline, first change the value of CMAKE_BUILD_TYPE in the CMakePresets.json file in the CustomOp operator project directory to Debug. Then run the following command in that directory to build the operator project:

      ./build.sh

      After the build is complete, the .run operator package is generated in the build_out directory.
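
    The CMAKE_BUILD_TYPE change can also be scripted. The following is a minimal sketch in Python, assuming CMakePresets.json keeps the variable under configurePresets[].cacheVariables, either as a plain string or as a {"type", "value"} object (both forms are allowed by the CMake presets format). The helper name set_build_type is hypothetical, and json.dumps rewrites the file's formatting.

```python
import json
from pathlib import Path


def set_build_type(presets: dict, build_type: str = "Debug") -> int:
    """Set CMAKE_BUILD_TYPE in every configure preset that defines it.

    Returns the number of presets that were updated.
    """
    updated = 0
    for preset in presets.get("configurePresets", []):
        cache = preset.get("cacheVariables", {})
        if "CMAKE_BUILD_TYPE" not in cache:
            continue
        entry = cache["CMAKE_BUILD_TYPE"]
        if isinstance(entry, dict):
            # Object form: {"type": "STRING", "value": "Release"}
            entry["value"] = build_type
        else:
            # Plain string form: "Release"
            cache["CMAKE_BUILD_TYPE"] = build_type
        updated += 1
    return updated


if __name__ == "__main__":
    # Assumption: the script is run from the CustomOp project directory.
    path = Path("CMakePresets.json")
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
        count = set_build_type(data)
        path.write_text(json.dumps(data, indent=4) + "\n", encoding="utf-8")
        print(f"Updated {count} preset(s)")
```

    Editing the file by hand has the same effect; the script is only useful when the build is automated.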
  3. In the directory where the custom operator package is stored, run the following command to deploy the operator package:
    ./build_out/custom_opp_<target_os>_<target_architecture>.run
  4. Switch to the ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/AclNNInvocation directory and run the following command:
    ./run.sh
  5. Set the following environment variable, then perform simulation by referring to msprof op simulator function to generate dump data.
    export LD_LIBRARY_PATH=${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/CustomOp/build_out/op_host/:$LD_LIBRARY_PATH
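
    When the simulation is launched from a script rather than an interactive shell, the same variable can be prepared programmatically. The sketch below is an illustration only: prepend_library_path is a hypothetical helper, and it drops empty entries so that an unset LD_LIBRARY_PATH does not leave a trailing separator (which the dynamic loader treats as the current directory).

```python
import os


def prepend_library_path(new_dir: str, env: dict) -> dict:
    """Return a copy of env with new_dir placed first in LD_LIBRARY_PATH."""
    updated = dict(env)
    current = updated.get("LD_LIBRARY_PATH", "")
    # Drop empty entries and any earlier occurrence of new_dir.
    parts = [p for p in current.split(os.pathsep) if p and p != new_dir]
    updated["LD_LIBRARY_PATH"] = os.pathsep.join([new_dir] + parts)
    return updated
```

    The returned dictionary can be passed as the env argument of subprocess.run when starting the simulation command.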
  6. Generate an operator simulation pipeline file.

    Run the following command. For details about the parameters, see Table 1.

    msopgen sim -c core{id} -d xx/{path of dump data} -subc {sub core id} -out {output path} -reloc {path of .o file or executable file} 
    Table 1 Parameters

    | Parameter | Description | Required |
    |---|---|---|
    | sim | Used for operations related to performance simulation. NOTE: The msopgen sim command will be unavailable in the next MindStudio version. After that, you can use the simulation capability provided by msOpProf instead. For details, see Tool Usage. | Yes |
    | -c, --core-id | Core ID. Processor ID, for example, core0. | Yes |
    | -d, --dump-dir | Path to the dump file. The path can be either absolute or relative. | Yes |
    | -subc, --subcore_id | Subcore ID. A single subcore can be displayed. If the dump file name contains veccore{id} or cubecore{id}, set this option to specify the dump file to be parsed. For example, if the file name is core0.veccore0.instr_log.dump, veccore0 is the subcore ID. | Select either -subc or -mix. NOTE: This parameter needs to be set only for the Atlas A3 training products / Atlas A3 inference products and Atlas A2 training products / Atlas A2 inference products. |
    | -mix, --mixcore-mode | Displays the Mix fused operator. | Select either -subc or -mix. |
    | -reloc, --relocatable-file | Path of the .o file or executable file generated after operator compilation on the kernel. When this option is set, the tool maps the pipeline to code lines and generates .csv files indicating the time consumption of code lines and instructions. NOTE: The .o file that contains debugging information is generated during operator project build, at ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/CustomOp/build_out/op_kernel/binary/ascendxxxy/add_custom/AddCustom_*.o. To include debugging information, change CMAKE_BUILD_TYPE in CMakePresets.json to Debug. For details, see build procedure. | No |
    | -out, --output | Output path. The path can be either absolute or relative. The user who runs the tool must have read and write permissions on the path. | Yes |
    | -h, --help | Outputs the help information. | No |
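
    The -subc description above derives the subcore ID from the dump file name. As a rough illustration, the following Python sketch extracts the core ID and subcore ID from names such as core0.veccore0.instr_log.dump. The function name parse_dump_name is hypothetical, and the handling of names without a subcore part (for example core0.instr_log.dump) is an assumption rather than documented behavior.

```python
import re

# Matches names like core0.veccore0.instr_log.dump or core0.cubecore0.instr_log.dump.
# Assumption: a name without a subcore part looks like core0.instr_log.dump.
DUMP_NAME = re.compile(
    r"^(?P<core>core\d+)"
    r"(?:\.(?P<subcore>(?:veccore|cubecore)\d+))?"
    r"\.instr_log\.dump$"
)


def parse_dump_name(filename: str):
    """Return (core_id, subcore_id) for a dump file name, or None if it does not match.

    subcore_id is None when the name contains no veccore{id} or cubecore{id} part.
    """
    match = DUMP_NAME.match(filename)
    if match is None:
        return None
    return match.group("core"), match.group("subcore")
```

    The first element is the value for -c and the second, when present, is the value for -subc.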

    Run the following commands:

    Example 1:
    msopgen sim -c core0 -d xx/{model}/ca/add_custom/add_custom_pre_static_add_custom -out ./output_data -subc cubecore0 -reloc xx/.o
    • -c specifies the core ID of the dump data file to be parsed, for example, core0.
    • -d specifies the path of the dump data file generated in the performance simulation environment, for example, {model}/ca/add_custom/add_custom_pre_static_add_custom.
    • -subc specifies the subcore ID of the dump file to be parsed. For example, if the file name is core0.cubecore0.instr_log.dump, cubecore0 is the subcore ID. (This parameter needs to be set only for the Atlas A3 training products / Atlas A3 inference products and Atlas A2 training products / Atlas A2 inference products.)
    • -reloc specifies the path of the .o file or executable file generated after operator compilation on the kernel.
    Example 2:
    msopgen sim -c core0 -d xx/{model}/ca/add_custom/add_custom_pre_static_add_custom -out ./output_data -mix
    • -c specifies the core ID of the dump data file to be parsed, for example, core0.
    • -d specifies the path of the dump data file generated in the performance simulation environment, for example, {model}/ca/add_custom/add_custom_pre_static_add_custom.
    • -mix indicates that the Mix fused operator is displayed.
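
    When several cores or subcores need to be parsed, the command line can be assembled in a script. The sketch below is a hypothetical helper (build_sim_command is not part of msOpGen) that uses only the options listed in Table 1. Because Table 1 asks you to select either -subc or -mix, it rejects the case where both are given; whether one of them is mandatory depends on the product, so the sketch does not enforce that.

```python
import shlex


def build_sim_command(core_id, dump_dir, output, subcore_id=None, mix=False, reloc=None):
    """Assemble the argument list for an msopgen sim invocation."""
    if subcore_id is not None and mix:
        raise ValueError("select either subcore_id (-subc) or mix (-mix), not both")
    cmd = ["msopgen", "sim", "-c", core_id, "-d", dump_dir, "-out", output]
    if subcore_id is not None:
        cmd += ["-subc", subcore_id]
    if mix:
        cmd.append("-mix")
    if reloc is not None:
        cmd += ["-reloc", reloc]
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace them with the real dump directory and output path.
    args = build_sim_command("core0", "path/to/dump", "./output_data", subcore_id="cubecore0")
    print(shlex.join(args))
```

    The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with paths that contain spaces.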
  7. View the operator simulation pipeline file.

    Enter chrome://tracing in the address bar of the Chrome browser and drag the dump2trace_core*.json file from the output path to the blank area of the page. Use the keyboard shortcuts to navigate the view (w: zoom in; s: zoom out; a: move left; d: move right). The following figures show examples.

    Figure 1 Display of a single subcore
    Figure 2 Display of the Mix fused operator
    Table 2 Field description

    | Field | Description |
    |---|---|
    | VECTOR | Vector unit. |
    | SCALAR | Scalar unit. |
    | CUBE | Cube unit. |
    | MTE1 | Data transfer flow. The transfer direction is L1 -> {L0A/L0B, UBUF}. |
    | MTE2 | Data transfer flow. The transfer direction is {DDR/GM, L2} -> {L1, L0A/B, UBUF}. |
    | MTE3 | Data transfer flow. The transfer direction is UBUF -> {DDR/GM, L2, L1}. |
    | FIXP | Data transfer pipeline, from FixPipe L0C to OUT/L1. (Only displayed for the Atlas A3 training products / Atlas A3 inference products and Atlas A2 training products / Atlas A2 inference products.) |
    | FLOWCTRL | Control flow instruction. |
    | ICmiss | ICache miss. |
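
    Besides viewing the file in the browser, the pipeline file can be summarized in a script. The sketch below assumes the file follows the Chrome Trace Event Format, with complete events ("ph": "X") carrying a "dur" field and optional thread_name metadata events naming each track. Whether msOpGen writes exactly these event types is an assumption, and the function names are hypothetical.

```python
import json
from collections import defaultdict


def load_events(text: str) -> list:
    """Accept both trace layouts: a bare event array or {"traceEvents": [...]}."""
    data = json.loads(text)
    if isinstance(data, dict):
        return data.get("traceEvents", [])
    return data


def busy_time_by_track(events: list) -> dict:
    """Sum the duration of complete events per (pid, tid) track.

    Tracks are labeled with their thread_name metadata when available,
    otherwise with "pid=...,tid=...".
    """
    names = {}
    totals = defaultdict(float)
    for event in events:
        key = (event.get("pid"), event.get("tid"))
        if event.get("ph") == "M" and event.get("name") == "thread_name":
            names[key] = event.get("args", {}).get("name")
        elif event.get("ph") == "X":
            totals[key] += event.get("dur", 0)
    result = {}
    for key, total in totals.items():
        label = names.get(key) or f"pid={key[0]},tid={key[1]}"
        result[label] = total
    return result
```

    Under these assumptions, comparing the totals of tracks such as VECTOR and MTE2 gives a quick indication of whether computation or data transfer dominates.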

  8. View the time consumption files of code lines and instructions. These .csv files are generated when -reloc is specified.
    Open the code line time consumption file {core ID}_code_exe_prof.csv in the output path.
    Figure 3 Time consumption file of code lines
    Open the instruction time consumption file {core ID}_instr_exe_prof.csv in the output path.
    Figure 4 Time consumption file of instructions

    The call count and cycles fields in each file show the number of times a code line or instruction is called and its accumulated duration.
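
    To find hot spots quickly, the .csv files can be sorted by accumulated cycles. The following is a minimal sketch, assuming the column header is spelled exactly cycles; check the header row of the generated file and pass the actual name through cycles_field if it differs. The function name top_by_cycles is hypothetical.

```python
import csv
import io


def top_by_cycles(csv_text: str, limit: int = 5, cycles_field: str = "cycles") -> list:
    """Return the rows with the largest accumulated cycles, highest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: float(row[cycles_field]), reverse=True)
    return rows[:limit]
```

    Read the file content with open(path, newline="", encoding="utf-8").read() and pass it to the function; each returned row is a dictionary keyed by the column names of the file.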