Instruction Pipeline Chart

The instruction pipeline chart shows the timing relationships between instructions and associates them with the call stack, so that you can quickly trace bottlenecks.

  • If you compile with the -g option, the generated binary file contains debugging information. Restrict the access permissions of user programs built with debugging information so that only authorized personnel can access the binary file.
  • If you do not need the functions provided by the llvm-symbolizer component, do not include -g when compiling the program that is passed to msProf. In this case, msProf does not call the functions of the llvm-symbolizer component.
  • To analyze the performance of part of an operator, call TRACE_START and TRACE_STOP within a single core, and add -DASCENDC_TRACE_ON to the compilation configuration file as described in the next item. The system can then generate the pipeline chart. The content of the chart is described in this section.
  • Add -DASCENDC_TRACE_ON to the compilation configuration file, as shown in the following sample project.
    For the AddKernelInvocationNeo operator project, add the following code to the ${git_clone_path}/samples/operator/ascendc/0_introduction/3_add_kernellaunch/AddKernelInvocationNeo/cmake/npu_lib.cmake file:
    ascendc_compile_definitions
    (
        ...
        -DASCENDC_TRACE_ON
    )
    
  • Google Chrome

    Enter chrome://tracing in the address bar of Google Chrome, drag the instruction pipeline file (trace.json) generated in Tool Usage into the blank area, and use the keyboard shortcuts to navigate the chart (W: zoom in; S: zoom out; A: move left; D: move right). Table 1 describes the key fields.

    Table 1 Key fields

    Field      Description
    ---------  ------------------------------------------------------------------
    VECTOR     Vector unit.
    SCALAR     Scalar unit.
    CUBE       Cube unit.
    MTE1       Data transfer pipeline, from L1 to {L0A/L0B, UBUF}.
    MTE2       Data transfer pipeline, from {DDR/GM, L2} to {L1, L0A/B, UBUF}.
    MTE3       Data transfer pipeline, from UBUF to {DDR/GM, L2, L1}, or from L1 to {DDR/L2}.
    FLOWCTRL   Control flow instruction.
    CACHEMISS  Instruction cache (ICache) miss.
    USEMASK    User-defined instrumentation range.
    ALL        Instructions in this channel are executed in all channels.
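
Besides viewing trace.json in chrome://tracing, you may want a quick per-channel summary. The following is a minimal Python sketch, not part of msProf. It assumes that trace.json follows the Chrome Trace Event Format, that each channel in Table 1 appears as a named track (a "thread_name" metadata event), and that instructions are recorded as complete ("X") events with a "dur" field. These are assumptions about the file layout; check them against your own trace.json. The function names load_events and busy_time_per_track are hypothetical.

```python
import json
from collections import defaultdict


def load_events(path):
    """Read a Chrome trace file and return its list of events."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # The Trace Event Format allows either a bare JSON array or an
    # object that stores the array under the "traceEvents" key.
    if isinstance(data, dict):
        return data.get("traceEvents", [])
    return data


def busy_time_per_track(events):
    """Sum the durations of complete ("X") events for each track.

    Assumption: each pipeline channel (VECTOR, MTE2, ...) is a track
    labelled by a "thread_name" metadata event. Tracks without a label
    are reported by their (pid, tid) pair instead.
    """
    names = {}
    for ev in events:
        if ev.get("ph") == "M" and ev.get("name") == "thread_name":
            key = (ev.get("pid"), ev.get("tid"))
            names[key] = ev.get("args", {}).get("name")

    totals = defaultdict(float)
    for ev in events:
        if ev.get("ph") != "X":
            continue  # skip metadata and non-duration events
        key = (ev.get("pid"), ev.get("tid"))
        label = names.get(key) or str(key)
        totals[label] += ev.get("dur", 0)
    return dict(totals)
```

For example, busy_time_per_track(load_events("trace.json")) returns a dictionary that maps each track name to its total busy time, in the time units used by the file. A channel whose total is much larger than the others is a natural place to start looking for a bottleneck.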