Communication and Computing Pipeline Chart

After you tune fused communication-compute operators with msprof op, you can visualize the generated trace.json and visualize_data.bin files in MindStudio Insight. The chart shows information such as the communication and compute status and the instruction duration, which helps developers identify bottlenecks. Currently, only MC2 and LCCL fused operators are supported.

  • To use MindStudio Insight, you need to install the MindStudio Insight software package separately. For the download link, see "Installation and Uninstallation".
  • For details about how to import the visualize_data.bin file into MindStudio Insight, see Importing Profile Data.
  • For details about the MindStudio Insight operations and fields, see "System Tuning" > "Timeline" in MindStudio Insight User Guide.
  • If the -g compilation option is added, the generated binary file contains debugging information. You are advised to restrict the access permission of user programs with debugging information to ensure that only authorized personnel can access the binary file.
  • Google Chrome

    Enter chrome://tracing in the address box of Google Chrome and drag the communication and computing pipeline file (trace.json) generated by msprof op to the blank area. Use the keyboard shortcuts to navigate the view (W: zoom in; S: zoom out; A: move left; D: move right). Table 1 describes the key fields.

    Table 1 Key fields

    | Field     | Field Function                                                                                      | MC2 Operator | LCCL Operator |
    |-----------|-----------------------------------------------------------------------------------------------------|--------------|---------------|
    | AI CORE   | Overall running status of the operators on the AI Core.                                             | Supported    | Supported     |
    | AI CPU    | Overall running status of the operators on the AI CPU.                                              | Supported    | Not supported |
    | TURN      | Pipeline of the operators on the AI CPU at different communication rounds.                          | Supported    | Not supported |
    | AIC BLOCK | Overall running status and key API call status of the operators on each Cube core of the AI Core.   | Supported    | Supported     |
    | AIV BLOCK | Overall running status and key API call status of the operators on each Vector core of the AI Core. | Supported    | Supported     |
    | HCCL      | Multi-device collective communication pipeline of operators using HCCL.                             | Supported    | Not supported |
    | HCCL TASK | Multi-device collective communication task execution pipeline of operators using HCCL.              | Supported    | Not supported |

  • MindStudio Insight
    The trace.json or visualize_data.bin file generated by msprof op can be imported into MindStudio Insight for display.
    Figure 1 Communication and computing pipeline chart

    • Displays the time overlap of operators on the AI CPU and AI Core, which is used to evaluate the performance of the fused operators.
    • Displays the pipeline of operators on the AI CPU at different communication rounds.
    • Displays the running time of operators on each block and the key API call pipeline.
    • Displays the multi-device collective communication pipeline and multi-device collective communication task pipeline of operators using HCCL.
      • The MC2 operators can call the AllReduce, AllGather, ReduceScatter, and AlltoAll APIs on Atlas A2 training products / Atlas A2 inference products, and the AllGather, ReduceScatter, and AlltoAllV APIs on Atlas A3 training products / Atlas A3 inference products. For details, see "High-level APIs" > "HCCL" > "HCCL" in . If the -g compilation option is added, clicking a specific API displays the call stack of the corresponding code line.
      • For details of MC2 and LCCL operator support, see Table 1.
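
The overlap between communication and compute that the chart shows visually can also be estimated offline from trace.json. The following Python sketch is illustrative only. It assumes that trace.json follows the Chrome Trace Event Format accepted by chrome://tracing, that is, either a JSON array of events or an object with a "traceEvents" array, with complete events ("ph": "X") that carry "ts" and "dur" in microseconds. The function names are hypothetical, and how events map to lanes such as AI CORE or AI CPU depends on the fields actually present in your file, so the selection step is left to the caller.

```python
import json
from collections import defaultdict


def parse_events(data):
    """Return the complete ("ph": "X") events from decoded trace JSON.

    Assumption: the data is either a bare list of events or a dict
    with a "traceEvents" list, as the Chrome Trace Event Format allows.
    """
    events = data["traceEvents"] if isinstance(data, dict) else data
    return [e for e in events if e.get("ph") == "X" and "dur" in e]


def load_events(path):
    """Read a trace file from disk and return its complete events."""
    with open(path, encoding="utf-8") as f:
        return parse_events(json.load(f))


def total_duration_by_name(events):
    """Sum the duration (microseconds) of all events sharing a name."""
    totals = defaultdict(float)
    for e in events:
        totals[e["name"]] += e["dur"]
    return dict(totals)


def to_intervals(events):
    """Convert events to (start, end) tuples in microseconds."""
    return [(e["ts"], e["ts"] + e["dur"]) for e in events]


def merge_intervals(intervals):
    """Merge overlapping or touching intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlap_time(a, b):
    """Total time during which both interval sets are active."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total
```

To use the sketch, call load_events("trace.json"), split the returned events into two groups (for example, by the "pid" or "tid" values that correspond to the lanes of interest in your file), and pass to_intervals() of each group to overlap_time(). A larger overlap relative to the total communication time suggests that communication is better hidden behind compute.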