Instructions
Displaying the Parallel Strategy
The Summary tab page lets you manage the parallel strategy settings. How the settings are filled in depends on whether the imported profile data contains them.
- If the profile data contains the collected parallel strategy parameter values, the values are read automatically and filled in on the page, and the page information is updated based on them. To change the parallel parameter values, enter the correct values and click Generate. In the dialog box that is displayed, confirm the information and click Confirm. The page information is updated accordingly.
- If the profile data does not contain the collected parallel strategy parameter values, enter the correct values of PP Size, TP Size, CP Size, DP Size, MoE-TP Size, and EP Size as required and click Generate. The page information is updated accordingly.
Configure the parallel strategy as follows: PP Size = 4, TP Size = 4, CP Size = 4, DP Size = 8, and EP Size = 4. Click Generate. The parallel strategy graph is updated based on the input values, as shown in Figure 1.
In each dimension, you can select Pipeline Parallel, Tensor Parallel, Context Parallel, Data Parallel, or Expert Parallel as required. The parallel strategy graph displays the division strategy check boxes for the selected options. When you click a check box, the lower part of the page is updated accordingly.
You can also select Performance Metric and Visible Range to color the target in the parallel strategy graph. Select Target Index and click Find to quickly locate the target index.
You can set the selected performance metric of any target index as the minimum or maximum filter value to quickly locate and analyze issues. In all dimensions, select a performance metric, right-click a target index in the parallel strategy graph, and choose the minimum or maximum filter value setting from the shortcut menu. The rendering color of the graph and the filter range change accordingly.
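The effect of the visible range on coloring can be pictured with a minimal sketch. It assumes that targets outside the range are left uncolored and that targets inside it are shaded linearly between the minimum and maximum filter values; the actual rendering rule of MindStudio Insight is not stated here, and the function name and sample values are hypothetical.

```python
from typing import Dict, Optional


def color_intensity(metric_by_rank: Dict[int, float],
                    visible_min: float,
                    visible_max: float) -> Dict[int, Optional[float]]:
    """Map each target index to a shade in [0, 1], or None if it is filtered out.

    Assumption: linear shading between the two filter values.
    """
    span = visible_max - visible_min
    shades: Dict[int, Optional[float]] = {}
    for rank, value in metric_by_rank.items():
        if value < visible_min or value > visible_max:
            shades[rank] = None          # outside the visible range: not colored
        elif span == 0:
            shades[rank] = 1.0           # degenerate range: single shade
        else:
            shades[rank] = (value - visible_min) / span
    return shades


# Hypothetical metric values for four target indexes.
metric = {0: 10.0, 1: 20.0, 2: 30.0, 3: 50.0}

# Using the value of target index 1 as the minimum filter value
# mirrors the right-click action described above.
shades = color_intensity(metric, visible_min=metric[1], visible_max=50.0)
```

In this sketch, target index 0 falls below the new minimum and drops out of the coloring, which is how narrowing the range isolates outliers.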
- Rule for setting the parallel strategy: The parallel strategy value, calculated as PP Size × TP Size × CP Size × DP Size, must be greater than or equal to the number of imported cards.
- If you import data that was previously imported to MindStudio Insight, the parallel strategy value that was set earlier is remembered and displayed by default.
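The rule above can be checked before you click Generate. The following is a minimal sketch, not part of MindStudio Insight; the function names are hypothetical, and the sample values are the ones used in the configuration example earlier in this section.

```python
from math import prod


def parallel_strategy_value(pp_size: int, tp_size: int,
                            cp_size: int, dp_size: int) -> int:
    """Return PP Size x TP Size x CP Size x DP Size."""
    sizes = (pp_size, tp_size, cp_size, dp_size)
    if any(size < 1 for size in sizes):
        raise ValueError("every parallel size must be a positive integer")
    return prod(sizes)


def covers_imported_cards(pp_size: int, tp_size: int, cp_size: int,
                          dp_size: int, imported_cards: int) -> bool:
    """True if the parallel strategy value is >= the number of imported cards."""
    value = parallel_strategy_value(pp_size, tp_size, cp_size, dp_size)
    return value >= imported_cards


# PP Size = 4, TP Size = 4, CP Size = 4, DP Size = 8
value = parallel_strategy_value(pp_size=4, tp_size=4, cp_size=4, dp_size=8)
print(value)  # 512
```

With these values, the setting satisfies the rule for any import of up to 512 cards.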
Supporting Page Information Linkage
- Flow linkage
After the parallel strategy is set, if DP + PP + CP + TP is selected, you can click the target index in the strategy graph display area to show the related flow. When you click the flow, the lower part of the page changes accordingly, as shown in Figure 2. This function helps developers view data differences.
You can also click the target index in the DP + PP + TP or DP + PP + CP + TP dimension to display flows. Right-click a flow and choose the option for viewing communication duration analysis. Then, the communication page is displayed, showing details about the communication group to which the target index belongs.
In the DP + PP + CP + TP area, click the tensor parallel flow related to rank 0 in the strategy graph. The Computation/Communication Overview, Computing Detail (Rank ID), and Communication Detail (Rank ID) areas are updated. Computation/Communication Overview displays details about the communication group (0, 1, 2, 3) related to rank 0. Computing Detail (Rank ID) and Communication Detail (Rank ID) display the computing details and communication details of the corresponding card, respectively. When you click the bar chart of any card in the Computation/Communication Overview area, the computing details and communication details of that card are displayed.
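Why the tensor parallel flow of rank 0 covers ranks 0, 1, 2, and 3 can be illustrated with a small sketch. It assumes a Megatron-style tp-cp-dp-pp layout in which the dimension listed first varies fastest across rank IDs, and it omits EP for brevity. The function names and the layout are assumptions for illustration; the grouping that MindStudio Insight derives from the selected Algorithm may differ.

```python
from math import prod
from typing import Dict, List

# Assumed layout: the dimension listed first varies fastest (EP omitted).
SIZES: Dict[str, int] = {"tp": 4, "cp": 4, "dp": 8, "pp": 4}


def rank_coords(rank: int, sizes: Dict[str, int]) -> Dict[str, int]:
    """Decompose a rank ID into one coordinate per parallel dimension."""
    coords = {}
    for name, size in sizes.items():
        coords[name] = rank % size
        rank //= size
    return coords


def group_of(rank: int, dim: str, sizes: Dict[str, int]) -> List[int]:
    """Ranks that share every coordinate with `rank` except along `dim`."""
    target = rank_coords(rank, sizes)
    members = []
    for other in range(prod(sizes.values())):
        coords = rank_coords(other, sizes)
        if all(coords[k] == target[k] for k in sizes if k != dim):
            members.append(other)
    return members


print(group_of(0, "tp", SIZES))  # [0, 1, 2, 3]
print(group_of(0, "cp", SIZES))  # [0, 4, 8, 12]
```

Under this layout, the tensor parallel group of rank 0 matches the group (0, 1, 2, 3) described above.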
- Box selection linkage
Select any dimension. When Pipeline Parallel, Tensor Parallel, Context Parallel, or Data Parallel is selected, the parallel strategy graph displays the division strategy according to the selected option. The box selection area is displayed. Click the box, and the lower part of the page changes accordingly, as shown in Figure 3.
Displaying Parallel Strategies in Different Dimensions
On the Summary page, after the parallel strategy value is set, you can select DP + PP, DP + PP + CP, DP + PP + TP, or DP + PP + CP + TP to display the parallel strategy graph.
You can select a dimension tab on the parallel strategy graph to expand the corresponding dimension, or right-click a target index to expand or collapse each dimension.
- Expand: In the DP + PP or DP + PP + CP dimension, right-click a target index and choose Expand from the shortcut menu.
- Collapse: In the DP + PP, DP + PP + CP, DP + PP + TP, or DP + PP + CP + TP dimension, right-click a target index and choose Collapse from the shortcut menu.
When CP Size is set to 1, the DP + PP and DP + PP + TP parallel dimensions are displayed, and Context Parallel Size is not displayed in each dimension.
The details of each dimension are as follows.
- DP + PP dimension
If the DP + PP parallel dimension is selected, you can select Pipeline Parallel and Data Parallel. When you click a box in the strategy graph, the Computation/Communication Overview bar chart changes accordingly. When you select performance metrics, the strategy graph is rendered accordingly to facilitate analysis, as shown in Figure 4. You can set the visible range corresponding to a performance metric and enter the required index in the Target Index text box to accurately locate the target.
You can click the icon of a data type on the top of the bar chart to hide or display the corresponding data in the bar chart.
- DP + PP + CP dimension
When Algorithm is set to Megatron-LM (tp-cp-ep-dp-pp), Megatron-LM (tp-cp-pp-ep-dp), or MindSpeed (tp-cp-ep-dp-pp), the DP + PP + CP parallel dimension is displayed. You can select Pipeline Parallelism, Context Parallelism, and Data Parallelism. When you click a box in the strategy graph, the Computation/Communication Overview bar chart changes accordingly. When you select performance metrics, the strategy graph is rendered accordingly to facilitate analysis, as shown in Figure 5. You can set the visible range corresponding to a performance metric and enter the required index in the Target Index text box to accurately locate the target.
You can click the icon of a data type on the top of the bar chart to hide or display the corresponding data in the bar chart.
- DP + PP + TP dimension
When Algorithm is set to MindIE-LLM (tp-dp-ep-pp-moetp) or vLLM (tp-pp-dp-ep), the DP + PP + TP parallel dimension is displayed. You can select Pipeline Parallelism, Tensor Parallelism, Data Parallelism, and Expert Parallelism. When you click a box in the strategy graph, the Computation/Communication Overview bar chart changes accordingly. When you select performance metrics, the strategy graph is rendered accordingly to facilitate analysis, as shown in Figure 6. You can set the visible range corresponding to a performance metric and enter the required index in the Target Index text box to accurately locate the target.
You can click a card and select the corresponding flow to display Computation/Communication Overview, Computing Detail, and Communication Detail under the strategy graph. You can also click a data type icon on the top of the bar chart to hide or display the data in the bar chart.
- DP + PP + CP + TP dimension
When Algorithm is set to Megatron-LM (tp-cp-ep-dp-pp), Megatron-LM (tp-cp-pp-ep-dp), or MindSpeed (tp-cp-ep-dp-pp), the DP + PP + CP + TP parallel dimension is displayed. You can select Pipeline Parallelism, Tensor Parallelism, Context Parallelism, or Data Parallelism. When you click a box in the strategy graph, the Computation/Communication Overview bar chart changes accordingly. When you select performance metrics, the strategy graph is rendered accordingly to facilitate analysis, as shown in Figure 7. You can set the visible range corresponding to a performance metric and enter the required index in the Target Index text box to accurately locate the target.
You can click a card and select the corresponding flow to display Computation/Communication Overview, Computing Detail, and Communication Detail under the strategy graph. You can also click a data type icon on the top of the bar chart to hide or display the corresponding data in the bar chart.
Comparing Cluster Data
MindStudio Insight allows developers to compare cluster data to intuitively view data differences. For details about how to set baseline data and comparison data, see Data Comparison.
In comparison mode, the Base Info area on the Summary tab page displays the comparison data and baseline data information.
In the Parallel Strategy Analysis area, the parallel strategy configuration parameters must be set according to the rules. The number of imported cards is the larger of the device counts in the comparison data and the baseline data. When you select a target index in the parallel strategy graph, the displayed details show the comparison data, and the values in brackets show the difference from the baseline data.
In the bar chart details in the Computation/Communication Overview area, the difference between the comparison data and baseline data is displayed, as shown in Figure 8.
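The comparison display can be sketched as follows. The sketch assumes that the difference is computed as comparison value minus baseline value and is shown with a sign; the direction and the exact format used by MindStudio Insight are not stated here, and the function names are hypothetical.

```python
from typing import Sequence


def imported_card_count(comparison_devices: Sequence[int],
                        baseline_devices: Sequence[int]) -> int:
    """Number of imported cards: the larger of the two device counts."""
    return max(len(comparison_devices), len(baseline_devices))


def format_with_difference(comparison: float, baseline: float) -> str:
    """Show the comparison value followed by the difference in brackets.

    Assumption: difference = comparison - baseline.
    """
    return f"{comparison:.2f} ({comparison - baseline:+.2f})"


print(imported_card_count(range(8), range(16)))  # 16
print(format_with_difference(12.5, 10.0))        # 12.50 (+2.50)
```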
Showing Expert Distribution Heatmap and Load Balancing Heatmap
In the MoE Expert Load Balancing Analysis area, you can choose to display the expert distribution heatmap and the expert load balancing heatmap.
- Expert distribution heatmap
If the imported profile data contains expert distribution heatmap data, set Data Version to Profiling, set other related parameters, and click Search. The expert distribution heatmap is displayed.
- Expert load balancing heatmap
To import the balanced or unbalanced dump data, set Data Version to Dump balanced or Dump unbalanced and click the import icon to import the corresponding file. The expert load balancing heatmap of the MoE model is displayed, as shown in Figure 9. After the file is imported successfully, the default values of the parameters are automatically set.
The vertical axis covers all model layers (MoE layers + non-MoE layers), and the horizontal axis indicates the expert sequence number. When you select a cell in the chart, the details of the cell are displayed, including the expert index, ID, layer ID, rank ID, and access traffic.
You can hold down Ctrl and scroll the mouse wheel to zoom in or out on the heatmap.
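The layout of the heatmap, with one row per model layer and one column per expert, can be sketched as a small grid. The record format and the max-to-mean imbalance ratio below are illustrative assumptions; they are not the dump file format or a metric defined by MindStudio Insight.

```python
from typing import Iterable, List, Tuple

# Hypothetical record: (layer ID, expert index, access traffic).
Record = Tuple[int, int, int]


def build_heatmap(records: Iterable[Record],
                  num_layers: int, num_experts: int) -> List[List[int]]:
    """Accumulate access traffic into a layers x experts grid.

    Rows of non-MoE layers stay all zero because no expert is accessed there.
    """
    grid = [[0] * num_experts for _ in range(num_layers)]
    for layer_id, expert_index, traffic in records:
        grid[layer_id][expert_index] += traffic
    return grid


def layer_imbalance(row: List[int]) -> float:
    """Illustrative imbalance ratio: busiest expert divided by the mean load."""
    mean = sum(row) / len(row)
    return max(row) / mean if mean else 0.0


records = [(0, 0, 10), (0, 1, 30), (2, 1, 5)]
grid = build_heatmap(records, num_layers=3, num_experts=2)
print(grid)                      # [[10, 30], [0, 0], [0, 5]]
print(layer_imbalance(grid[0]))  # 1.5
```

A ratio close to 1 for a layer suggests evenly loaded experts, while larger values point to the hot cells that the heatmap highlights.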